PROTEIN SCAFFOLDS FOR ANTIBODY MIMICS
AND OTHER BINDING PROTEINS

Background of the Invention 
This invention relates to protein scaffolds useful, for example, for the
generation of products having novel binding characteristics.

Proteins having relatively defined three-dimensional structures,
commonly referred to as protein scaffolds, may be used as reagents for the
design of engineered products. These scaffolds typically contain one or more
regions which are amenable to specific or random sequence variation, and such
sequence randomization is often carried out to produce libraries of proteins
from which desired products may be selected. One particular area in which
such scaffolds are useful is the field of antibody design.

A number of previous approaches to the manipulation of the
mammalian immune system to obtain reagents or drugs have been attempted.
These have included injecting animals with antigens of interest to obtain
mixtures of polyclonal antibodies reactive against specific antigens, production
of monoclonal antibodies in hybridoma cell culture (Koehler and Milstein,
Nature 256:495, 1975), modification of existing monoclonal antibodies to
obtain new or optimized recognition properties, creation of novel antibody
fragments with desirable binding characteristics, and randomization of single
chain antibodies (created by connecting the variable regions of the heavy and
light chains of antibody molecules with a flexible peptide linker) followed by
selection for antigen binding by phage display (Clackson et al., Nature 352:624,
1991).

In addition, several non-immunoglobulin protein scaffolds have been
proposed for obtaining proteins with novel binding properties. For example, a
"minibody" scaffold, which is related to the immunoglobulin fold, has been
designed by deleting three beta strands from a heavy chain variable domain of a
monoclonal antibody (Tramontano et al., J. Mol. Recognit. 7:9, 1994). This
protein includes 61 residues and can be used to present two hypervariable
loops. These two loops have been randomized and products selected for
antigen binding, but thus far the framework appears to have somewhat limited
utility due to solubility problems. Another framework used to display loops has
been tendamistat, a 74 residue, six-strand beta sheet sandwich held together by
two disulfide bonds (McConnell and Hoess, J. Mol. Biol. 250:460, 1995). This
scaffold includes three loops, but, to date, only two of these loops have been
examined for randomization potential.

Other proteins have been tested as frameworks and have been used to
display randomized residues on alpha helical surfaces (Nord et al., Nat.
Biotechnol. 15:772, 1997; Nord et al., Protein Eng. 8:601, 1995), loops
between alpha helices in alpha helix bundles (Ku and Schultz, Proc. Natl. Acad.
Sci. USA 92:6552, 1995), and loops constrained by disulfide bridges, such as
those of the small protease inhibitors (Markland et al., Biochemistry 35:8045,
1996; Markland et al., Biochemistry 35:8058, 1996; Rottgen and Collins, Gene
164:243, 1995; Wang et al., J. Biol. Chem. 270:12250, 1995).

Summary of the Invention

The present invention provides a new family of proteins capable of
evolving to bind any compound of interest. These proteins, which make use of
a fibronectin or fibronectin-like scaffold, function in a manner characteristic of
natural or engineered antibodies (that is, polyclonal, monoclonal, or
single-chain antibodies) and, in addition, possess structural advantages.
Specifically, the structure of these antibody mimics has been designed for
optimal folding, stability, and solubility, even under conditions which normally
lead to the loss of structure and function in antibodies.

These antibody mimics may be utilized for the purpose of designing
proteins which are capable of binding to virtually any compound (for example,
any protein) of interest. In particular, the fibronectin-based molecules
described herein may be used as scaffolds which are subjected to directed
evolution designed to randomize one or more of the three fibronectin loops
which are analogous to the complementarity-determining regions (CDRs) of an
antibody variable region. Such a directed evolution approach results in the
production of antibody-like molecules with high affinities for antigens of
interest. In addition, the scaffolds described herein may be used to display
defined exposed loops (for example, loops previously randomized and selected
on the basis of antigen binding) in order to direct the evolution of molecules
that bind to such introduced loops. A selection of this type may be carried out
to identify recognition molecules for any individual CDR-like loop or,
alternatively, for the recognition of two or all three CDR-like loops combined
into a non-linear epitope.

Accordingly, the present invention features a protein that includes a
fibronectin type III domain having at least one randomized loop, the protein
being characterized by its ability to bind to a compound that is not bound by the
corresponding naturally-occurring fibronectin.

In preferred embodiments, the fibronectin type III domain is a
mammalian (for example, a human) fibronectin type III domain; and the protein
includes the tenth module of the fibronectin type III (¹⁰Fn3) domain. In such
proteins, compound binding is preferably mediated by either one, two, or three
¹⁰Fn3 loops. In other preferred embodiments, the second loop of ¹⁰Fn3 may be
extended in length relative to the naturally-occurring module, or the ¹⁰Fn3 may
lack an integrin-binding motif. In these molecules, the integrin-binding motif
may be replaced by an amino acid sequence in which a basic amino acid-neutral
amino acid-acidic amino acid sequence (in the N-terminal to C-terminal
direction) replaces the integrin-binding motif; one preferred sequence is
serine-glycine-glutamate. In another preferred embodiment, the fibronectin
type III domain-containing proteins of the invention lack disulfide bonds.

Any of the fibronectin type III domain-containing proteins described
herein may be formulated as part of a fusion protein (for example, a fusion
protein which further includes an immunoglobulin domain, a complement
protein, a toxin protein, or an albumin protein). In addition, any of the
fibronectin type III domain proteins may be covalently bound to a nucleic acid
(for example, an RNA), and the nucleic acid may encode the protein.
Moreover, the protein may be a multimer, or, particularly if it lacks an
integrin-binding motif, it may be formulated in a physiologically-acceptable
carrier.

The present invention also features proteins that include a
fibronectin type III domain having at least one mutation in a β-sheet sequence
which changes the scaffold structure. Again, these proteins are characterized
by their ability to bind to compounds that are not bound by the corresponding
naturally-occurring fibronectin.

In a related aspect, the invention further features nucleic acids 
encoding any of the proteins of the invention. In preferred embodiments, the 
nucleic acid is DNA or RNA.

In another related aspect, the invention also features a method for 
generating a protein which includes a fibronectin type III domain and which is 
pharmaceutically acceptable to a mammal, involving removing the integrin- 
binding domain of said fibronectin type III domain. This method may be 
applied to any of the fibronectin type III domain-containing proteins described
above and is particularly useful for generating proteins for human therapeutic
applications. The invention also features such fibronectin type III
domain-containing proteins which lack integrin-binding domains.

In yet other related aspects, the invention features screening methods
which may be used to obtain or evolve randomized fibronectin type III proteins
capable of binding to compounds of interest, or to obtain or evolve compounds
(for example, proteins) capable of binding to a particular protein containing a
randomized fibronectin type III motif. In addition, the invention features
screening procedures which combine these two methods, in any order, to obtain
either compounds or proteins of interest.

In particular, the first screening method, useful for the isolation or
identification of randomized proteins of interest, involves: (a) contacting the
compound with a candidate protein, the candidate protein including a
fibronectin type III domain having at least one randomized loop, the contacting
being carried out under conditions that allow compound-protein complex
formation; and (b) obtaining, from the complex, the protein which binds to the
compound.

The second screening method, for isolating or identifying a
compound which binds to a protein having a randomized fibronectin type III
domain, involves: (a) contacting the protein with a candidate compound,
the contacting being carried out under conditions that allow compound-protein
complex formation; and (b) obtaining, from the complex, the compound which
binds to the protein.

In preferred embodiments, the methods further involve either
randomizing at least one loop of the fibronectin type III domain of the protein
obtained in step (b) and repeating steps (a) and (b) using the further randomized
protein, or modifying the compound obtained in step (b) and repeating steps (a)
and (b) using the further modified compound. In addition, the compound is
preferably a protein, and the fibronectin type III domain is preferably a
mammalian (for example, a human) fibronectin type III domain. In other
preferred embodiments, the protein includes the tenth module of the fibronectin
type III domain (¹⁰Fn3), and binding is mediated by one, two or three ¹⁰Fn3
loops. In addition, the second loop of ¹⁰Fn3 may be extended in length relative
to the naturally-occurring module, or ¹⁰Fn3 may lack an integrin-binding motif.
Again, as described above, the integrin-binding motif may be replaced by an
amino acid sequence in which a basic amino acid-neutral amino acid-acidic
amino acid sequence (in the N-terminal to C-terminal direction) replaces the
integrin-binding motif; one preferred sequence is serine-glycine-glutamate.
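
The repeated cycle of steps (a) and (b) followed by further randomization may be pictured with a toy simulation. The following is only a sketch under stated assumptions: binding is not measured but is replaced by a hypothetical score (the number of loop positions matching an arbitrary target string), and the pool size, number of survivors, and all function names are illustrative and do not come from the methods described herein.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def toy_score(loop, target):
    """Hypothetical stand-in for binding: count positions matching `target`."""
    return sum(1 for a, b in zip(loop, target) if a == b)


def select(pool, target, keep):
    """Steps (a) and (b): rank candidates by toy score and keep the best."""
    ranked = sorted(pool, key=lambda loop: toy_score(loop, target), reverse=True)
    return ranked[:keep]


def further_randomize(loop, rng):
    """Replace one randomly chosen loop position with a random residue."""
    pos = rng.randrange(len(loop))
    return loop[:pos] + rng.choice(AMINO_ACIDS) + loop[pos + 1:]


def evolve(target, pool_size, keep, rounds, rng):
    """Run repeated select / further-randomize cycles; return best score per round."""
    pool = ["".join(rng.choice(AMINO_ACIDS) for _ in target)
            for _ in range(pool_size)]
    best_per_round = []
    for _ in range(rounds):
        survivors = select(pool, target, keep)
        best_per_round.append(toy_score(survivors[0], target))
        # Keep the survivors and refill the pool with further randomized copies.
        pool = list(survivors)
        while len(pool) < pool_size:
            pool.append(further_randomize(rng.choice(survivors), rng))
    return best_per_round


if __name__ == "__main__":
    # "ACDEFGHIKL" is an arbitrary illustrative target, not a real epitope.
    print(evolve("ACDEFGHIKL", pool_size=50, keep=5, rounds=8,
                 rng=random.Random(1)))
```

Because the survivors of each round are carried into the next pool, the best toy score can never decrease from round to round; this mirrors, in a very simplified way, the idea of repeating steps (a) and (b) on a further randomized protein.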

The selection methods described herein may be carried out using any 
fibronectin type III domain-containing protein. For example, the fibronectin 
type III domain-containing protein may lack disulfide bonds, or may be 
formulated as part of a fusion protein (for example, a fusion protein which 
further includes an immunoglobulin Fc domain, a complement protein, a toxin
protein, or an albumin protein). In addition, selections may be carried out using
the fibronectin type III domain proteins covalently bound to nucleic acids (for
example, RNAs or any nucleic acid which encodes the protein). Moreover, the
selections may be carried out using fibronectin domain-containing protein 
multimers. 

Preferably, the selections involve the immobilization of the binding 
target on a solid support. Preferred solid supports include columns (for 
example, affinity columns, such as agarose columns) or microchips. 

As used herein, by "fibronectin type III domain" is meant a domain 
having 7 or 8 beta strands which are distributed between two beta sheets, which 
themselves pack against each other to form the core of the protein, and further 
containing loops which connect the beta strands to each other and are solvent 
exposed. There are at least three such loops at each edge of the beta sheet 
sandwich, where the edge is the boundary of the protein perpendicular to the 
direction of the beta strands. Preferably, a fibronectin type III domain includes 
a sequence which exhibits at least 30% amino acid identity, and preferably at
least 50% amino acid identity, to the sequence encoding the structure of the
¹⁰Fn3 domain referred to as "1ttg" (ID = "1ttg" (one ttg)) available from the
Protein Data Bank. Sequence identity referred to in this definition is
determined by the Homology program, available from Molecular Simulation
(San Diego, CA). The invention further includes polymers of ¹⁰Fn3-related
molecules, which are an extension of the use of the monomer structure, whether
or not the subunits of the polyprotein are identical or different in sequence.
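
By way of illustration only, the 30% and 50% identity thresholds in this definition can be expressed as a short calculation. The sketch below is not the Homology program named above. It assumes two sequences that have already been aligned to equal length (with gaps written as "-") and counts identical residues over the aligned, non-gap columns; other conventions for the denominator exist. The function names and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, non-gap columns that hold identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    compared = 0
    identical = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            identical += 1
    if compared == 0:
        return 0.0
    return 100.0 * identical / compared


def meets_threshold(aligned_a: str, aligned_b: str, minimum: float = 30.0) -> bool:
    """True if identity is at least `minimum` percent (30 by default)."""
    return percent_identity(aligned_a, aligned_b) >= minimum


if __name__ == "__main__":
    # Toy, made-up sequences; they are not taken from any real domain.
    template = "ACDEFGHIKL"
    candidate = "ACDQFG-IKV"
    print(percent_identity(template, candidate))
    print(meets_threshold(template, candidate, minimum=50.0))
```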

By "naturally occurring fibronectin" is meant any fibronectin protein 
that is encoded by a living organism.

By "randomized" is meant including one or more amino acid
alterations relative to a template sequence.

By a "protein" is meant any sequence of two or more amino acids,
regardless of length, post-translation modification, or function. "Protein" and
"peptide" are used interchangeably herein.

By "RNA" is meant a sequence of two or more covalently bonded,
naturally occurring or modified ribonucleotides. One example of a modified
RNA included within this term is phosphorothioate RNA.

By "DNA" is meant a sequence of two or more covalently bonded, 
naturally occurring or modified deoxyribonucleotides.

By a "nucleic acid" is meant any two or more covalently bonded
nucleotides or nucleotide analogs or derivatives. As used herein, this term
includes, without limitation, DNA, RNA, and PNA.

By "pharmaceutically acceptable" is meant a compound or protein 
that may be administered to an animal (for example, a mammal) without
significant adverse medical consequences.

By "physiologically acceptable carrier" is meant a carrier which does 
not have a significant detrimental impact on the treated host and which retains 
the therapeutic properties of the compound with which it is administered. One
exemplary physiologically acceptable carrier is physiological saline. Other
physiologically acceptable carriers and their formulations are known to one
skilled in the art and are described, for example, in Remington's
Pharmaceutical Sciences (18th edition), ed. A. Gennaro, 1990, Mack Publishing
Company, Easton, PA, incorporated herein by reference.

By "selecting" is meant substantially partitioning a molecule from 
other molecules in a population. As used herein, a "selecting" step provides at
least a 2-fold, preferably, a 30-fold, more preferably, a 100-fold, and, most
preferably, a 1000-fold enrichment of a desired molecule relative to undesired
molecules in a population following the selection step. A selection step may be
repeated any number of times, and different types of selection steps may be
combined in a given approach.
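
The fold-enrichment figures in this definition can be made concrete with a small calculation. The sketch below assumes that enrichment is computed as the ratio of desired to undesired molecules after a selection step divided by the same ratio before it; this reading, the function names, and the example counts are illustrative assumptions rather than a procedure specified herein.

```python
from fractions import Fraction

# Thresholds named in the definition of "selecting", highest first.
TIERS = [
    (1000, "most preferred (at least 1000-fold)"),
    (100, "more preferred (at least 100-fold)"),
    (30, "preferred (at least 30-fold)"),
    (2, "minimum (at least 2-fold)"),
]


def fold_enrichment(desired_before: int, undesired_before: int,
                    desired_after: int, undesired_after: int) -> float:
    """Desired:undesired ratio after selection divided by the ratio before."""
    counts = (desired_before, undesired_before, desired_after, undesired_after)
    if min(counts) <= 0:
        raise ValueError("all counts must be positive integers")
    before = Fraction(desired_before, undesired_before)
    after = Fraction(desired_after, undesired_after)
    return float(after / before)


def enrichment_tier(fold: float) -> str:
    """Name the highest threshold from the definition that `fold` reaches."""
    for threshold, label in TIERS:
        if fold >= threshold:
            return label
    return "below the 2-fold minimum"


if __name__ == "__main__":
    # Hypothetical counts: 1 desired per 1,000,000 undesired molecules before
    # the step, 1 desired per 1,000 undesired molecules after it.
    fold = fold_enrichment(1, 1_000_000, 1, 1_000)
    print(fold, enrichment_tier(fold))
```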

By "binding partner," as used herein, is meant any molecule which
has a specific, covalent or non-covalent affinity for a portion of a desired
compound (for example, protein) of interest. Examples of binding partners
include, without limitation, members of antigen/antibody pairs,
protein/inhibitor pairs, receptor/ligand pairs (for example cell surface
receptor/ligand pairs, such as hormone receptor/peptide hormone pairs),
enzyme/substrate pairs (for example, kinase/substrate pairs),
lectin/carbohydrate pairs, oligomeric or heterooligomeric protein aggregates,
DNA binding protein/DNA binding site pairs, RNA/protein pairs, and nucleic 
acid duplexes, heteroduplexes, or ligated strands, as well as any molecule 
which is capable of forming one or more covalent or non-covalent bonds (for 
example, disulfide bonds) with any portion of another molecule (for example, a 
compound or protein). 

By a "solid support" is meant, without limitation, any column (or
column material), bead, test tube, microtiter dish, solid particle (for example,
agarose or sepharose), microchip (for example, silicon, silicon-glass, or gold
chip), or membrane (for example, the membrane of a liposome or vesicle) to
which an affinity complex may be bound, either directly or indirectly (for
example, through other binding partner intermediates such as other antibodies
or Protein A), or in which an affinity complex may be embedded (for example,
through a receptor or channel).

The present invention provides a number of advantages. For 
example, as described in more detail below, the present antibody mimics 
exhibit improved biophysical properties, such as stability under reducing 
conditions and solubility at high concentrations. In addition, these molecules 
may be readily expressed and folded in prokaryotic systems, such as E. coli, in
eukaryotic systems, such as yeast, and in in vitro translation systems, such as 
the rabbit reticulocyte lysate system. Moreover, these molecules are extremely 
amenable to affinity maturation techniques involving multiple cycles of 
selection, including in vitro selection using RNA-protein fusion technology 
(Roberts and Szostak, Proc. Natl. Acad. Sci. USA 94:12297, 1997; Szostak et
al., U.S.S.N. 09/007,005 and U.S.S.N. 09/247,190; Szostak et al. 
WO98/31700), phage display (see, for example, Smith and Petrenko, Chem. 
Rev. 97:317, 1997), and yeast display systems (see, for example, Boder and 
Wittrup, Nature Biotech. 15:553, 1997). 

Other features and advantages of the present invention will be 
apparent from the following detailed description thereof, and from the claims.

Brief Description of the Drawings

FIGURE 1 is a photograph showing a comparison between the
structures of antibody heavy chain variable regions from camel (dark blue) and
llama (light blue), in each of two orientations.

FIGURE 2 is a photograph showing a comparison between the
structures of the camel antibody heavy chain variable region (dark blue), the
llama antibody heavy chain variable region (light blue), and a fibronectin type
III module number 10 (¹⁰Fn3) (yellow).

FIGURE 3 is a photograph showing a fibronectin type III module
number 10 (¹⁰Fn3), with the loops corresponding to the antigen-binding loops
in IgG heavy chains highlighted in red.

FIGURE 4 is a graph illustrating a sequence alignment between a 
fibronectin type III protein domain and related protein domains. 

FIGURE 5 is a photograph showing the structural similarities
between a ¹⁰Fn3 domain and 15 related proteins, including fibronectins,
tenascins, collagens, and undulin. In this photograph, the regions are labeled as
follows: constant, dark blue; conserved, light blue; neutral, white; variable, red;
and RGD integrin-binding motif (variable), yellow.

FIGURE 6 is a photograph showing space filling models of
fibronectin III modules 9 and 10, in each of two different orientations. The two
modules and the integrin binding loop (RGD) are labeled. In this figure, blue
indicates positively charged residues, red indicates negatively charged residues,
and white indicates uncharged residues.

FIGURE 7 is a photograph showing space filling models of
fibronectin III modules 7-10, in each of three different orientations. The four
modules are labeled. In this figure, blue indicates positively charged residues,
red indicates negatively charged residues, and white indicates uncharged
residues.

FIGURE 8 is a photograph illustrating the formation, under different
salt conditions, of RNA-protein fusions which include fibronectin type III
domains.

FIGURE 9 is a series of photographs illustrating the selection of
fibronectin type III domain-containing RNA-protein fusions, as measured by
PCR signal analysis.

FIGURE 10 is a graph illustrating an increase in the percent TNF-α
binding during the selections described herein, as well as a comparison between
RNA-protein fusion and free protein selections.

FIGURE 11 is a series of schematic representations showing IgG,
¹⁰Fn3, Fn-CH1-CH2-CH3, and Fn-CH2-CH3 (clockwise from top left).

FIGURE 12 is a photograph showing a molecular model of Fn-CH1-CH2-CH3
based on known three-dimensional structures of IgG (X-ray
crystallography) and ¹⁰Fn3 (NMR and X-ray crystallography).



Detailed Description

The novel antibody mimics described herein have been designed to 
be superior both to antibody-derived fragments and to non-antibody 
frameworks, for example, those frameworks described above. 

The major advantage of these antibody mimics over antibody
fragments is structural. These scaffolds are derived from whole, stable, and
soluble structural modules found in human body fluid proteins. Consequently,
they exhibit better folding and thermostability properties than antibody
fragments, whose creation involves the removal of parts of the antibody native
fold, often exposing amino acid residues that, in an intact antibody, would be
buried in a hydrophobic environment, such as an interface between variable and
constant domains. Exposure of such hydrophobic residues to solvent increases
the likelihood of aggregation.

In addition, the antibody mimics described herein have no disulfide
bonds, which have been reported to retard or prevent proper folding of antibody
fragments under certain conditions. Since the present scaffolds do not rely on
disulfides for native fold stability, they are stable under reducing conditions,
unlike antibodies and their fragments which unravel upon disulfide bond
breakdown.

Moreover, these fibronectin-based scaffolds provide the functional
advantages of antibody molecules. In particular, despite the fact that the ¹⁰Fn3
module is not an immunoglobulin, its overall fold is close to that of the variable
region of the IgG heavy chain (Figure 2), making it possible to display the three
fibronectin loops analogous to CDRs in relative orientations similar to those of
native antibodies. Because of this structure, the present antibody mimics
possess antigen binding properties that are similar in nature and affinity to
those of antibodies, and a loop randomization and shuffling strategy may be
employed in vitro that is similar to the process of affinity maturation of
antibodies in vivo.

There are now described below exemplary fibronectin-based 
scaffolds and their use for identifying, selecting, and evolving novel binding 
proteins as well as their target ligands. These examples are provided for the 
purpose of illustrating, and not limiting, the invention. 

¹⁰Fn3 Structural Motif

The antibody mimics of the present invention are based on the
structure of a fibronectin module of type III (Fn3), a common domain found in
mammalian blood and structural proteins. This domain occurs more than 400
times in the protein sequence database and has been estimated to occur in 2%
of the proteins sequenced to date, including fibronectins, tenascin, intracellular
cytoskeletal proteins, and prokaryotic enzymes (Bork and Doolittle, Proc. Natl.
Acad. Sci. USA 89:8990, 1992; Bork et al., Nature Biotech. 15:553, 1997;
Meinke et al., J. Bacteriol. 175:1910, 1993; Watanabe et al., J. Biol. Chem.
265:15659, 1990). In particular, these scaffolds include, as templates, the tenth
module of human Fn3 (¹⁰Fn3), which comprises 94 amino acid residues. The
overall fold of this domain is closely related to that of the smallest functional
antibody fragment, the variable region of the heavy chain, which comprises the
entire antigen recognition unit in camel and llama IgG (Figures 1 and 2). The
major differences between camel and llama domains and the ¹⁰Fn3 domain are
that (i) ¹⁰Fn3 has fewer beta strands (seven vs. nine) and (ii) the two beta sheets
packed against each other are connected by a disulfide bridge in the camel and
llama domains, but not in ¹⁰Fn3.

The three loops of ¹⁰Fn3 corresponding to the antigen-binding loops
of the heavy chain run between amino acid residues 21-31, 51-56, and
76-88 (Figure 3). The lengths of the first and the third loops, 11 and 12 residues,
respectively, fall within the range of the corresponding antigen-recognition
loops found in antibody heavy chains, that is, 10-12 and 3-25 residues,
respectively. Accordingly, once randomized and selected for high antigen
affinity, these two loops make contacts with antigens equivalent to the contacts
of the corresponding loops in antibodies.

In contrast, the second loop of '°Fn3 is only 6 residues long, whereas 
the corresponding loop in antibody heavy chains ranges from 16-19 residues. 
To optimize antigen binding, therefore, the second loop of ¹⁰Fn3 is preferably
extended by 10-13 residues (in addition to being randomized) to obtain the 
greatest possible flexibility and affinity in antigen binding. Indeed, in general, 
the lengths as well as the sequences of the CDR-like loops of the antibody 
mimics may be randomized during in vitro or in vivo affinity maturation (as 
described in more detail below). 
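
The arithmetic behind the preferred extension of the second loop can be checked directly. The short sketch below assumes that the residue range 51-56 is inclusive (which gives the stated length of 6) and simply adds each extension of 10 to 13 residues; the constant and function names are illustrative.

```python
SECOND_LOOP = (51, 56)              # residue range of the second loop (inclusive)
EXTENSION_RANGE = (10, 13)          # preferred number of added residues
ANTIBODY_LOOP_RANGE = (16, 19)      # corresponding heavy chain loop lengths


def loop_length(first: int, last: int) -> int:
    """Number of residues in an inclusive residue range."""
    return last - first + 1


def extended_lengths(loop, extension_range):
    """All loop lengths reachable by adding an allowed number of residues."""
    base = loop_length(*loop)
    low, high = extension_range
    return [base + added for added in range(low, high + 1)]


if __name__ == "__main__":
    lengths = extended_lengths(SECOND_LOOP, EXTENSION_RANGE)
    low, high = ANTIBODY_LOOP_RANGE
    print(lengths)                                    # [16, 17, 18, 19]
    print(all(low <= n <= high for n in lengths))     # True
```

Under this reading, an extension of 10-13 residues brings the 6-residue loop exactly into the 16-19 residue range quoted for the corresponding antibody loop.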

The tenth human fibronectin type III domain, ¹⁰Fn3, refolds rapidly
even at low temperature; its backbone conformation has been recovered within
1 second at 5°C. Thermodynamic stability of ¹⁰Fn3 is high (ΔG = 24 kJ/mol =
5.7 kcal/mol), correlating with its high melting temperature of 110°C.
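
The equivalence of the two stability figures can be verified with a one-line unit conversion. The sketch below assumes the thermochemical calorie (1 kcal = 4.184 kJ); the function name is illustrative.

```python
KJ_PER_KCAL = 4.184  # thermochemical calorie: 1 kcal = 4.184 kJ


def kj_to_kcal(energy_kj_per_mol: float) -> float:
    """Convert an energy in kJ/mol to kcal/mol."""
    return energy_kj_per_mol / KJ_PER_KCAL


if __name__ == "__main__":
    # 24 kJ/mol, the stability value quoted in the text.
    print(round(kj_to_kcal(24.0), 1))  # 5.7
```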

One of the physiological roles of ¹⁰Fn3 is as a subunit of fibronectin,
a glycoprotein that exists in a soluble form in body fluids and in an insoluble
form in the extracellular matrix (Dickinson et al., J. Mol. Biol. 236:1079,
1994). A fibronectin monomer of 220-250 kD contains 12 type I modules, two
type II modules, and 17 fibronectin type III modules (Potts and Campbell, Curr.
Opin. Cell Biol. 6:648, 1994). Different type III modules are involved in the
binding of fibronectin to integrins, heparin, and chondroitin sulfate. ¹⁰Fn3 was
found to mediate cell adhesion through an integrin-binding Arg-Gly-Asp
(RGD) motif on one of its exposed loops. Similar RGD motifs have been
shown to be involved in integrin binding by other proteins, such as fibrinogen,
von Willebrand factor, and vitronectin (Hynes et al., Cell 69:11, 1992). No
other matrix- or cell-binding roles have been described for ¹⁰Fn3.

The observation that ¹⁰Fn3 has only slightly more adhesive activity
than a short peptide containing RGD is consistent with the conclusion that the
cell-binding activity of ¹⁰Fn3 is localized in the RGD peptide rather than
distributed throughout the ¹⁰Fn3 structure (Baron et al., Biochemistry 31:2068,
1992). The fact that ¹⁰Fn3 without the RGD motif is unlikely to bind to other
plasma proteins or extracellular matrix makes ¹⁰Fn3 a useful scaffold to replace
antibodies. In addition, the presence of ¹⁰Fn3 in natural fibronectin in the
bloodstream suggests that ¹⁰Fn3 itself is unlikely to be immunogenic in the
organism of origin.

In addition, we have determined that the ¹⁰Fn3 framework possesses
exposed loop sequences tolerant of randomization, facilitating the generation of
diverse pools of antibody mimics. This determination was made by examining
the flexibility of the ¹⁰Fn3 sequence. In particular, the human ¹⁰Fn3 sequence
was aligned with the sequences of fibronectins from other sources as well as
sequences of related proteins (Figure 4), and the results of this alignment were
mapped onto the three-dimensional structure of the human ¹⁰Fn3 domain
(Figure 5). This alignment revealed that the majority of conserved residues are
found in the core of the beta sheet sandwich, whereas the highly variable
residues are located along the edges of the beta sheets, including the N- and
C-termini, on the solvent-accessible faces of both beta sheets, and on three
solvent-accessible loops that serve as the hypervariable loops for affinity
maturation of the antibody mimics. In view of these results, the randomization
of these three loops is unlikely to have an adverse effect on the overall fold or
stability of the ¹⁰Fn3 framework itself.

For the human ¹⁰Fn3 sequence, this analysis indicates that, at a
minimum, amino acids 1-9, 44-50, 61-54, 82-94 (edges of beta sheets); 19, 21,
30-46 (even), 79-65 (odd) (solvent-accessible faces of both beta sheets); 21-31,
51-56, 76-88 (CDR-like solvent-accessible loops); and 14-16 and 36-45 (other
solvent-accessible loops and beta turns) may be randomized to evolve new or
improved compound-binding proteins. In addition, as discussed above,
alterations in the lengths of one or more solvent exposed loops may also be
included in such directed evolution methods. Alternatively, changes in the
β-sheet sequences may also be used to evolve new proteins. These mutations
change the scaffold and thereby indirectly alter loop structure(s). If this
approach is taken, mutations should not saturate the sequence, but rather few
mutations should be introduced. Preferably, no more than 10 amino acid
changes, and, more preferably, no more than 3 amino acid changes should be
introduced to the β-sheet sequences by this approach.
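
As a minimal sketch of what randomizing the CDR-like loops means at the sequence level, the following example replaces every position in the residue ranges 21-31, 51-56, and 76-88 of a 94-residue template with a randomly chosen amino acid while leaving all other positions untouched. It rests on several assumptions: the ranges are treated as 1-based and inclusive, residues are drawn uniformly from the 20 standard amino acids, and the template is a placeholder string rather than the actual ¹⁰Fn3 sequence. Practical libraries are built at the nucleic acid level, which this illustration does not attempt to model.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes

# CDR-like solvent-accessible loops named in the text (1-based, inclusive).
CDR_LIKE_LOOPS = [(21, 31), (51, 56), (76, 88)]


def loop_positions(loops):
    """Set of 1-based residue positions covered by the given loop ranges."""
    return {pos for first, last in loops for pos in range(first, last + 1)}


def randomize_loops(template: str, loops, rng: random.Random) -> str:
    """Return a copy of `template` with every loop position randomized."""
    residues = list(template)
    for pos in sorted(loop_positions(loops)):
        if pos < 1 or pos > len(template):
            raise ValueError(f"position {pos} lies outside the template")
        residues[pos - 1] = rng.choice(AMINO_ACIDS)
    return "".join(residues)


if __name__ == "__main__":
    # Placeholder template of the stated length (94 residues); NOT the real
    # sequence of the domain.
    template = "G" * 94
    variant = randomize_loops(template, CDR_LIKE_LOOPS, random.Random(0))
    print(variant)
```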

Fibronectin Fusions 

The antibody mimics described herein may be fused to other protein
domains. For example, these mimics may be integrated with the human
immune response by fusing the constant region of an IgG (Fc) with a ¹⁰Fn3
module, preferably through the C-terminus of ¹⁰Fn3. The Fc in such a ¹⁰Fn3-Fc
fusion molecule activates the complement component of the immune response
and increases the therapeutic value of the antibody mimic. Similarly, a fusion
between ¹⁰Fn3 and a complement protein, such as C1q, may be used to target
cells, and a fusion between ¹⁰Fn3 and a toxin may be used to specifically
destroy cells that carry a particular antigen. In addition, ¹⁰Fn3 in any form may
be fused with albumin to increase its half-life in the bloodstream and its tissue
penetration. Any of these fusions may be generated by standard techniques, for
example, by expression of the fusion protein from a recombinant fusion gene
constructed using publicly available gene sequences.

Fibronectin Scaffold Multimers

In addition to fibronectin monomers, any of the fibronectin
constructs described herein may be generated as dimers or multimers of
¹⁰Fn3-based antibody mimics as a means to increase the valency and thus the
avidity of antigen binding. Such multimers may be generated through covalent
binding between individual ¹⁰Fn3 modules, for example, by imitating the
natural ⁸Fn3-⁹Fn3-¹⁰Fn3 C-to-N-terminus binding or by imitating antibody
dimers that are held together through their constant regions. A ¹⁰Fn3-Fc
construct may be exploited to design dimers of the general scheme of
¹⁰Fn3-Fc::Fc-¹⁰Fn3. The bonds engineered into the Fc::Fc interface may be
covalent or non-covalent. In addition, dimerizing or multimerizing partners
other than Fc can be used in ¹⁰Fn3 hybrids to create such higher order
structures.

In particular examples, covalently bonded multimers may be
generated by constructing fusion genes that encode the multimer or,
alternatively, by engineering codons for cysteine residues into monomer
sequences and allowing disulfide bond formation to occur between the
expression products. Non-covalently bonded multimers may also be generated
by a variety of techniques. These include the introduction, into monomer
sequences, of codons corresponding to positively and/or negatively charged
residues and allowing interactions between these residues in the expression
products (and therefore between the monomers) to occur. This approach may
be simplified by taking advantage of charged residues naturally present in a
monomer subunit, for example, the negatively charged residues of fibronectin.
Another means for generating non-covalently bonded antibody mimics is to
introduce, into the monomer gene (for example, at the amino- or carboxy-
termini), the coding sequences for proteins or protein domains known to
interact. Such proteins or protein domains include coiled-coil motifs, leucine
zipper motifs, and any of the numerous protein subunits (or fragments thereof)
known to direct formation of dimers or higher order multimers.

Fibronectin-Like Molecules

Although ¹⁰Fn3 represents a preferred scaffold for the generation of
antibody mimics, other molecules may be substituted for ¹⁰Fn3 in the molecules
described herein. These include, without limitation, human fibronectin
modules ¹Fn3-⁹Fn3 and ¹¹Fn3-¹⁷Fn3 as well as related Fn3 modules from
non-human animals and prokaryotes. In addition, Fn3 modules from other
proteins with sequence homology to ¹⁰Fn3, such as tenascins and undulins, may
also be used. Modules from different organisms and parent proteins may be
most appropriate for different applications; for example, in designing an
antibody mimic, it may be most desirable to generate that protein from a
fibronectin or fibronectin-like molecule native to the organism for which a
therapeutic or diagnostic molecule is intended.

Directed Evolution of Scaffold-Based Binding Proteins

The antibody mimics described herein may be used in any technique
for evolving new or improved binding proteins. In one particular example, the
target of binding is immobilized on a solid support, such as a column resin or
microtiter plate well, and the target is contacted with a library of candidate
scaffold-based binding proteins. Such a library may consist of ¹⁰Fn3 clones
constructed from the wild type ¹⁰Fn3 scaffold through randomization of the
sequence and/or the length of the ¹⁰Fn3 CDR-like loops. If desired, this library
may be an RNA-protein fusion library generated, for example, by the
techniques described in Szostak et al., U.S.S.N. 09/007,005 and 09/247,190;
Szostak et al., WO98/31700; and Roberts & Szostak, Proc. Natl. Acad. Sci.
USA (1997) vol. 94, p. 12297-12302. Alternatively, it may be a DNA-protein
library (for example, as described in Lohse, DNA-Protein Fusions and Uses
Thereof, U.S.S.N. 60/110,549, filed December 2, 1998 and [...],
filed December 2, 1999). The fusion library is incubated with the immobilized
target, the support is washed to remove non-specific binders, and the tightest
binders are eluted under very stringent conditions and subjected to PCR to
recover the sequence information or to create a new library of binders which
may be used to repeat the selection process, with or without further
mutagenesis of the sequence. A number of rounds of selection may be
performed until binders of sufficient affinity for the antigen are obtained.

In one particular example, the ¹⁰Fn3 scaffold may be used as the
selection target. For example, if a protein is required that binds a specific
peptide sequence presented in a ten residue loop, a single ¹⁰Fn3 clone is
constructed in which one of its loops has been set to the length of ten and to the
desired sequence. The new clone is expressed in vivo and purified, and then
immobilized on a solid support. An RNA-protein fusion library based on an
appropriate scaffold is then allowed to interact with the support, which is then
washed, and desired molecules are eluted and re-selected as described above.

Similarly, the ¹⁰Fn3 scaffold may be used to find natural proteins that
interact with the peptide sequence displayed in a ¹⁰Fn3 loop. The ¹⁰Fn3 protein
is immobilized as described above, and an RNA-protein fusion library is
screened for binders to the displayed loop. The binders are enriched through
multiple rounds of selection and identified by DNA sequencing.

In addition, in the above approaches, although RNA-protein libraries 
represent exemplary libraries for directed evolution, any type of scaffold-based 
library may be used in the selection methods of the invention. 

Use 

The antibody mimics described herein may be evolved to bind any 
antigen of interest. These proteins have thermodynamic properties superior to 
those of natural antibodies and can be evolved rapidly in vitro. Accordingly,
these antibody mimics may be employed in place of antibodies in all areas in
which antibodies are used, including in the research, therapeutic, and diagnostic
fields. In addition, because these scaffolds possess solubility and stability 
properties superior to antibodies, the antibody mimics described herein may 
also be used under conditions which would destroy or inactivate antibody 
molecules. Finally, because the scaffolds of the present invention may be 
evolved to bind virtually any compound, these molecules provide completely 
novel binding proteins which also find use in the research, diagnostic, and 
therapeutic areas. 



Experimental Results

Exemplary scaffold molecules described above were generated and
tested, for example, in selection protocols, as follows.

Library construction

A complex library was constructed from three fragments, each of
which contained one randomized area corresponding to a CDR-like loop. The
fragments were named BC, DE, and FG, based on the names of the
CDR-H-like loops contained within them; in addition to ¹⁰Fn3 and a
randomized sequence, each of the fragments contained stretches encoding an
N-terminal His6 domain or a C-terminal FLAG peptide tag. At each junction
between two fragments (i.e., between the BC and DE fragments or between the
DE and FG fragments), each DNA fragment contained recognition sequences
for the Ear I Type IIS restriction endonuclease. This restriction enzyme allowed
the splicing together of adjacent fragments while removing all foreign,
non-¹⁰Fn3, sequences. It also allows for a recombination-like mixing of the
three ¹⁰Fn3 fragments between cycles of mutagenesis and selection.

Each fragment was assembled from two overlapping
oligonucleotides, which were first annealed, then extended to form the
double-stranded DNA form of the fragment. The oligonucleotides that were
used to construct and process the three fragments are listed below; the "Top"
and "Bottom" species for each fragment are the oligonucleotides that contained
the entire ¹⁰Fn3 encoding sequence. In these oligonucleotide designations,
"N" indicates A, T, C, or G; and "S" indicates C or G.
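The "N" and "S" designations can be made concrete with a short sketch. The bottom-strand oligonucleotides below carry (SNN) repeats, which read as NNS codons on the coding strand. The following Python example, offered only as an illustration and not as part of the described method, expands the NNS codon and counts what it encodes under the standard genetic code. All function and variable names are hypothetical.

```python
from itertools import product

# Standard genetic code, built in TCAG order (names here are illustrative).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Degenerate symbols used in the oligonucleotide designations.
DEGENERATE = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "ACGT", "S": "CG"}
# The complement of S (C or G) is again S; the complement of N is N.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C", "N": "N", "S": "S"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a sequence that may contain N or S."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def expand(codon: str) -> list[str]:
    """All concrete codons matched by a degenerate codon."""
    return ["".join(p) for p in product(*(DEGENERATE[b] for b in codon))]


sense_codon = reverse_complement("SNN")          # coding-strand form
codons = expand(sense_codon)
encoded = {CODON_TABLE[c] for c in codons}
amino_acids = encoded - {"*"}
stop_codons = sorted(c for c in codons if CODON_TABLE[c] == "*")

print(sense_codon, len(codons), len(amino_acids), stop_codons)
```

Under these assumptions the NNS scheme covers all twenty amino acids with 32 codons while admitting a single stop codon, which is a common reason for choosing it over fully random NNN codons.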

HFnLBCTop (His):

5'- GG AAT TCC TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA
TTT ACA ATT ACA ATG CAT CAC CAT CAC CAT CAC GTT TCT GAT
GTT CCG AGG GAC CTG GAA GTT GTT GCT GCG ACC CCC ACC
AGC-3' (SEQ ID NO: 1)

HFnLBCTop (an alternative N-terminus):

5'- GG AAT TCC TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA
TTT ACA ATT ACA ATG GTT TCT GAT GTT CCG AGG GAC CTG GAA
GTT GTT GCT GCG ACC CCC ACC AGC-3' (SEQ ID NO: 2)
HFnLBCBot-flag8:

5'-AGC GGA TGC CTT GTC GTC GTC GTC CTT GTA GTC GCT CTT
CCC TGT TTC TCC GTA AGT GAT CCT GTA ATA TCT (SNN)7 CCA
GCT GAT CAG TAG GCT GGT GGG GGT CGC AGC -3' (SEQ ID NO: 3)

HFnBC3'-flag8:

5'-AGC GGA TGC CTT GTC GTC GTC GTC CTT GTA GTC GCT CTT
CCC TGT TTC TCC GTA AGT GAT CC-3' (SEQ ID NO: 4)

HFnLDETop:

5'- GG AAT TCC TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA
TTT ACA ATT ACA ATG CAT CAC CAT CAC CAT CAC CTC TTC ACA
GGA GGA AAT AGC CCT GTC C-3' (SEQ ID NO: 5)

HFnLDEBot-flag8:

5'-AGC GGA TGC CTT GTC GTC GTC GTC CTT GTA GTC GCT CTT
CGT ATA ATC AAC TCC AGG TTT AAG GCC GCT GAT GGT AGC TGT
(SNN)4 AGG CAC AGT GAA CTC CTG GAC AGG GCT ATT TCC TCC
TGT -3' (SEQ ID NO: 6)

HFnDE3'-flag8:

5'-AGC GGA TGC CTT GTC GTC GTC GTC CTT GTA GTC GCT CTT
CGT ATA ATC AAC TCC AGG TTT AAG G-3' (SEQ ID NO: 7)

HFnLFGTop:

5'- GG AAT TCC TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA
TTT ACA ATT ACA ATG CAT CAC CAT CAC CAT CAC CTC TTC TAT
ACC ATC ACT GTG TAT GCT GTC-3' (SEQ ID NO: 8)

HFnLFGBot-flag8:

5'-AGC GGA TGC CTT GTC GTC GTC GTC CTT GTA GTC TGT TCG
GTA ATT AAT GGA AAT TGG (SNN)10 AGT GAC AGC ATA CAC AGT
GAT GGT ATA -3' (SEQ ID NO: 9)

HFnFG3'-flag8:

5'-AGC GGA TGC CTT GTC GTC GTC GTC CTT GTA GTC TGT TCG
GTA ATT AAT GGA AAT TGG -3' (SEQ ID NO: 10)

T7TMV (introduces T7 promoter and TMV untranslated region needed for
in vitro translation):

5'- GCG TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA TTT ACA
ATT ACA-3' (SEQ ID NO: 11)

ASAflag8:

5'-AGC GGA TGC CTT GTC GTC GTC GTC CTT GTA GTC-3' (SEQ ID
NO: 12)

Unisplint-S (splint oligonucleotide used to ligate mRNA to the
puromycin-containing linker, described by Roberts et al., 1997, supra):

5'-TTTTTTTTTNAGCGGATGC-3' (SEQ ID NO: 13)

A18-2PEG (DNA-puromycin linker):

5'-(A)18(PEG)2CCPur (SEQ ID NO: 14)
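As a consistency check on the tag-encoding primers, the flag primer can be reverse-complemented and translated. The sketch below is illustrative only and assumes the standard genetic code; it takes the 33-nucleotide sequence given for SEQ ID NO: 12 and shows that its coding strand reads as the FLAG octapeptide followed by Ala-Ser-Ala. The helper names are hypothetical.

```python
# Standard genetic code, built in TCAG order (illustrative helper).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def translate(seq: str) -> str:
    """Translate a coding-strand sequence, reading whole codons from position 0."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


# Primer as written 5'->3' (an antisense, bottom-strand oligonucleotide).
ASA_FLAG8 = "AGCGGATGCCTTGTCGTCGTCGTCCTTGTAGTC"

coding_strand = reverse_complement(ASA_FLAG8)
peptide = translate(coding_strand)
print(coding_strand, peptide)
```

The same two helpers can be applied to any of the "Bot" oligonucleotides above to read off the encoded C-terminal residues.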

The pairs of oligonucleotides (500 pmol of each) were annealed in
100 µL of 10 mM Tris 7.5, 50 mM NaCl for 10 minutes at 85°C, followed by a
slow (0.5-1 hour) cooling to room temperature. The annealed fragments with
single-stranded overhangs were then extended using 100 U Klenow (New
England Biolabs, Beverly, MA) for each 100 µL aliquot of annealed oligos, and
the buffer made of 838.5 µL H2O, 9 µL 1 M Tris 7.5, 5 µL 1 M MgCl2, 20 µL 10
mM dNTPs, and 7.5 µL 1 M DTT. The extension reactions proceeded for 1 hour
at 25°C.
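The composition of the extension buffer can be checked with simple dilution arithmetic. The sketch below assumes the listed volumes are in microliters and the stock concentrations are 1 M Tris, 1 M MgCl2, 10 mM dNTPs, and 1 M DTT; it computes the total buffer volume and the final concentration of each component in the buffer alone, before any annealed oligonucleotide is added. Names are hypothetical.

```python
# (volume in microliters, stock concentration in mM); water has no solute.
EXTENSION_BUFFER = {
    "H2O": (838.5, None),
    "Tris 7.5": (9.0, 1000.0),
    "MgCl2": (5.0, 1000.0),
    "dNTPs": (20.0, 10.0),
    "DTT": (7.5, 1000.0),
}


def total_volume_ul(recipe: dict) -> float:
    """Sum of all component volumes in microliters."""
    return sum(volume for volume, _ in recipe.values())


def final_concentrations_mm(recipe: dict) -> dict:
    """Final concentration (mM) of each solute after mixing: C1*V1 / V_total."""
    total = total_volume_ul(recipe)
    return {
        name: stock * volume / total
        for name, (volume, stock) in recipe.items()
        if stock is not None
    }


total = total_volume_ul(EXTENSION_BUFFER)
final = final_concentrations_mm(EXTENSION_BUFFER)
for name, conc in final.items():
    print(f"{name}: {conc:.2f} mM in {total:.1f} uL")
```

On these assumptions the buffer is roughly 10 mM Tris, 5.7 mM MgCl2, 0.23 mM dNTPs, and 8.5 mM DTT.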

Next, each of the double-stranded fragments was transformed into an
RNA-protein fusion (PROfusion™) using the technique developed by Szostak
et al., U.S.S.N. 09/007,005 and U.S.S.N. 09/247,190; Szostak et al.,
WO98/31700; and Roberts & Szostak, Proc. Natl. Acad. Sci. USA (1997) vol.
94, p. 12297-12302. Briefly, the fragments were transcribed using an Ambion
in vitro transcription kit, MEGAshortscript (Ambion, Austin, TX), and the
resulting mRNA was gel-purified and ligated to a DNA-puromycin linker using
DNA ligase. The mRNA-DNA-puromycin molecule was then translated using
the Ambion rabbit reticulocyte lysate-based translation kit. The resulting
mRNA-DNA-puromycin-protein PROfusion™ was purified using Oligo(dT)
cellulose, and a complementary DNA strand was synthesized using reverse
transcriptase and the RT primers described above (Unisplint-S or flagASA),
following the manufacturer's instructions.

The PROfusion™ obtained for each fragment was next purified on
the resin appropriate to its peptide purification tag, i.e., on Ni-NTA agarose for
the His6-tag and M2 agarose for the FLAG-tag, following the procedure
recommended by the manufacturer. The DNA component of the tag-binding
PROfusions™ was amplified by PCR using Pharmacia Ready-to-Go PCR
Beads, 10 pmol of 5' and 3' PCR primers, and the following PCR program
(Pharmacia, Piscataway, NJ): Step 1: 95°C for 3 minutes; Step 2: 95°C for 30
seconds, 58/62°C for 30 seconds, 72°C for 1 minute, 20/25/30 cycles, as
required; Step 3: 72°C for 5 minutes; Step 4: 4°C until end.
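The PCR program above can be written down as data, which makes the alternatives (annealing at 58 or 62°C; 20, 25, or 30 cycles) explicit. The following sketch is an illustration only: it encodes the stated steps and sums the programmed hold times, ignoring ramp times and the final 4°C hold. The class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float
    seconds: int


def pcr_program(anneal_c: float, cycles: int) -> list[Step]:
    """Steps 1-3 of the program: initial denaturation, cycling, final extension."""
    cycle = [Step(95, 30), Step(anneal_c, 30), Step(72, 60)]
    return [Step(95, 180)] + cycle * cycles + [Step(72, 300)]


def hold_minutes(steps: list[Step]) -> float:
    """Total programmed hold time in minutes (thermal ramping not included)."""
    return sum(step.seconds for step in steps) / 60


for n_cycles in (20, 25, 30):
    program = pcr_program(anneal_c=58, cycles=n_cycles)
    print(n_cycles, "cycles:", hold_minutes(program), "min")
```

Each additional block of five cycles adds ten minutes of programmed hold time, so the three cycle counts differ by well under half an hour in total.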

The resulting DNA was cleaved by 5 U Ear I (New England Biolabs)
per 1 µg DNA; the reaction took place in T4 DNA Ligase Buffer (New England
Biolabs) at 37°C, for 1 hour, and was followed by an incubation at 70°C for 15
minutes to inactivate Ear I. Equal amounts of the BC, DE, and FG fragments
were combined and ligated to form a full-length ¹⁰Fn3 gene with randomized
loops. The ligation required 10 U of fresh Ear I (New England Biolabs) and 20
U of T4 DNA Ligase (Promega, Madison, WI), and took 1 hour at 37°C.
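The seamless joining of fragments relies on Ear I cutting outside its recognition sequence. The sketch below is a simplified illustration, assuming the commonly documented Ear I cut pattern CTCTTC(1/4), that is, a cut one nucleotide past the site on the strand carrying CTCTTC and four nucleotides past it on the opposite strand, leaving a 3-nucleotide 5' overhang. It applies this rule to the site-proximal portions of HFnLFGTop (top strand) and HFnDE3'-flag8 (bottom strand) as listed above and checks that the two overhangs are complementary. Function names are hypothetical.

```python
EAR_I_SITE = "CTCTTC"
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an unambiguous DNA sequence."""
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def ear1_overhang(strand: str) -> str:
    """3-nt 5' overhang left downstream of the first Ear I site on `strand`.

    `strand` is read 5'->3' and must carry CTCTTC itself. The overhang is
    made of nucleotides 2-4 after the site (assumed cut pattern 1/4).
    """
    index = strand.find(EAR_I_SITE)
    if index < 0:
        raise ValueError("no Ear I site on this strand")
    start = index + len(EAR_I_SITE) + 1
    overhang = strand[start:start + 3]
    if len(overhang) < 3:
        raise ValueError("sequence too short downstream of the site")
    return overhang


# Site-proximal segments copied from the oligonucleotides listed above.
FG_TOP_SEGMENT = "CTCTTCTATACCATCACT"      # from HFnLFGTop (top strand)
DE_BOTTOM_SEGMENT = "GCTCTTCGTATAATCAAC"   # from HFnDE3'-flag8 (bottom strand)

fg_overhang = ear1_overhang(FG_TOP_SEGMENT)
de_overhang = ear1_overhang(DE_BOTTOM_SEGMENT)
compatible = fg_overhang == reverse_complement(de_overhang)
print(fg_overhang, de_overhang, compatible)
```

Because the recognition sites lie in the portions that are cut away, the ligated DE-FG junction retains only scaffold sequence, consistent with the removal of foreign sequences described above.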

Three different libraries were made in the manner described above.
Each contained the form of the FG loop with 10 randomized residues. The BC
and the DE loops of the first library bore the wild type ¹⁰Fn3 sequence; a BC
loop with 7 randomized residues and a wild type DE loop made up the second
library; and a BC loop with 7 randomized residues and a DE loop with 4
randomized residues made up the third library. The complexity of the FG loop
in each of these three libraries was 10¹³; the further two randomized loops
provided the potential for a complexity too large to be sampled in a laboratory.
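The scale of these libraries follows from counting sequences. As a rough illustration, assuming all twenty amino acids are allowed at every randomized position and ignoring codon bias and stop codons, a loop with n randomized residues has 20^n possible sequences. The sketch below evaluates this for the three loops; the names are hypothetical.

```python
import math

AMINO_ACIDS = 20  # assumed alphabet size at each randomized position


def loop_complexity(randomized_residues: int) -> int:
    """Number of distinct peptide sequences for a fully randomized loop."""
    return AMINO_ACIDS ** randomized_residues


fg = loop_complexity(10)   # FG loop, 10 randomized residues
bc = loop_complexity(7)    # BC loop, 7 randomized residues
de = loop_complexity(4)    # DE loop, 4 randomized residues
combined = fg * bc * de    # third library: all three loops randomized

for label, value in (("FG", fg), ("BC", bc), ("DE", de), ("all three", combined)):
    print(f"{label}: about 10^{math.log10(value):.1f}")
```

On this count the FG loop alone gives roughly 10^13 sequences, and randomizing all three loops gives roughly 10^27, far beyond the number of molecules that can be handled in a single selection.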

The three libraries constructed were combined into one master
library in order to simplify the selection process; target binding itself was
expected to select the most suitable library for a particular challenge.
PROfusions™ were obtained from the master library following the general
procedure described in Szostak et al., U.S.S.N. 09/007,005 and 09/247,190;
Szostak et al., WO98/31700; and Roberts & Szostak, Proc. Natl. Acad. Sci.
USA (1997) vol. 94, p. 12297-12302 (Figure 8).



Fusion Selections

The master library in the PROfusion™ form was subjected to
selection for binding to TNF-α. Two protocols were employed: one in which
the target was immobilized on an agarose column and one in which the target
was immobilized on a BIACORE chip. First, an extensive optimization of
conditions to minimize background binders to the agarose column yielded the
favorable buffer conditions of 50 mM HEPES pH 7.4, 0.02% Triton, 100 µg/ml
sheared salmon sperm DNA. In this buffer, the non-specific binding of the
¹⁰Fn3 RNA fusion to TNF-α Sepharose was 0.3%. The non-specific binding
background of the ¹⁰Fn3 RNA-DNA to TNF-α Sepharose was found to be
0.1%.

During each round of selection on TNF-α Sepharose, the PROfusion™
library was first preincubated for an hour with underivatized Sepharose to
remove any remaining non-specific binders; the flow-through from this
pre-clearing was incubated for another hour with TNF-α Sepharose. The TNF-α
Sepharose was washed for 3-30 minutes.

After each selection, the PROfusion™ DNA that had been eluted
from the solid support with 0.3 M NaOH or 0.1 M KOH was amplified by PCR;
a DNA band of the expected size persisted through multiple rounds of selection
(Figure 9); similar results were observed in the two alternative selection
protocols, and only the data from the agarose column selection are shown in
Figure 9.

In the first seven rounds, the binding of library PROfusions™ to the 
target remained low; in contrast, when free protein was translated from DNA 
pools at different stages of the selection, the proportion of the column binding 
species increased significantly between rounds (Figure 10). Similar selections 
may be carried out with any other binding species target (for example, IL-1 and 
IL-13). 

Animal Studies

Wild-type ¹⁰Fn3 contains an integrin-binding tripeptide motif,
Arginine 78 - Glycine 79 - Aspartate 80 (the "RGD motif") at the tip of the FG
loop. In order to avoid integrin binding and a potential inflammatory response
based on this tripeptide in vivo, a mutant form of ¹⁰Fn3 was generated that
contained an inert sequence, Serine 78 - Glycine 79 - Glutamate 80 (the "SGE
mutant"), a sequence which is found in the closely related, wild-type ¹¹Fn3
domain. This SGE mutant was expressed as an N-terminally His6-tagged, free
protein in E. coli, and purified to homogeneity on a metal chelate column
followed by a size exclusion column.

In particular, the DNA sequence encoding His6-¹⁰Fn3(SGE) was
cloned into the pET9a expression vector and transformed into BL21 DE3
pLysS cells. The culture was then grown in LB broth containing 50 µg/mL
kanamycin at 37°C, with shaking, to an absorbance of 1.0, and was then induced
with 0.4 mM IPTG. The induced culture was further incubated, under the same
conditions, overnight (14-18 hours); the bacteria were recovered by standard,
low speed centrifugation. The cell pellet was resuspended in 1/50 of the
original culture volume of lysis buffer (50 mM Tris 8.0, 0.5 M NaCl, 5%
glycerol, 0.05% Triton X-100, and 1 mM PMSF), and the cells were lysed by
passing the resulting paste through a Microfluidics Corporation Microfluidizer
M110-EH, three times. The lysate was clarified by centrifugation, and the
supernatant was filtered through a 0.45 µm filter followed by filtration through
a 0.2 µm filter. 100 mL of the clarified lysate was loaded onto a 5 mL Talon
cobalt column (Clontech, Palo Alto, CA), washed with 70 mL of lysis buffer, and
eluted with a linear gradient of 0-30 mM imidazole in lysis buffer. The flow
rate through the column through all the steps was 1 mL/min. The eluted protein
was concentrated 10-fold by dialysis (MW cutoff = 3,500) against
15,000-20,000 PEG. The resulting sample was dialyzed into buffer 1 (lysis
buffer without the glycerol), then loaded, 5 mL at a time, onto a 16 x 60 mm
Sephacryl 100 size exclusion column equilibrated in buffer 1. The column was
run at 0.8 mL/min, in buffer 1; all fractions that contained a protein of the
expected MW were pooled, concentrated 10X as described above, then
dialyzed into PBS. Toxikon (MA) was engaged to perform endotoxin screens
and animal studies on the resulting sample.

In these animal studies, the endotoxin levels in the samples examined 
to date have been below the detection level of the assay. In a preliminary 
toxicology study, this protein was injected into two mice at the estimated 100X
therapeutic dose of 2.6 mg/mouse. The animals survived the two weeks of the
study with no apparent ill effects. These results suggest that ¹⁰Fn3 may be
incorporated safely into an IV drug. 

Alternative Constructs for In Vivo Use

To extend the half life of the 8 kD ¹⁰Fn3 domain, a larger molecule
has also been constructed that mimics natural antibodies. This ¹⁰Fn3-Fc
molecule contains the -CH1-CH2-CH3 (Figure 11) or -CH2-CH3 domains of the
IgG constant region of the host; in these constructs, the ¹⁰Fn3 domain is grafted
onto the N-terminus in place of the IgG VH domain (Figures 11 and 12). Such
antibody-like constructs are expected to improve the pharmacokinetics of the
protein as well as its ability to harness the natural immune response.

In order to construct the murine form of the ¹⁰Fn3-CH1-CH2-CH3
clone, the -CH1-CH2-CH3 region was first amplified from a mouse liver spleen
cDNA library (Clontech), then ligated into the pET25b vector. The primers
used in the cloning were 5' Fc Nest and 3' Fc Nest, and the primers used to
graft the appropriate restriction sites onto the ends of the recovered insert were
5' Fc HIII and 3' Fc Nhe:

5' Fc Nest 5' GCG GCA GGG TTT GCT TAC TGG GGC CAA GGG 3' (SEQ
ID NO: 15);

3' Fc Nest 5' GGG AGG GGT GGA GGT AGG TCA GAG TCC 3' (SEQ ID
NO: 16);

3' Fc Nhe 5' TTT GCT AGC TTT ACC AGG AGA GTG GGA GGC 3' (SEQ
ID NO: 17); and

5' Fc HIII 5' AAA AAG CTT GCC AAA ACG ACA CCC CCA TCT GTC 3'
(SEQ ID NO: 18).

Further PCR is used to remove the CH1 region from this clone and
create the Fc part of the shorter, ¹⁰Fn3-CH2-CH3 clone. The sequence encoding
¹⁰Fn3 is spliced onto the 5' end of each clone; either the wild type ¹⁰Fn3 cloned
from the same mouse spleen cDNA library or a modified ¹⁰Fn3 obtained by
mutagenesis or randomization of the molecules can be used. The
oligonucleotides used in the cloning of murine wild-type ¹⁰Fn3 were:

Mo5PCR-NdeI:

5' CATATGGTTTCTGATATTCCGAGAGATCTGGAG 3' (SEQ ID NO: 19);

Mo5PCR-His-NdeI (for an alternative N-terminus with the His6
purification tag):

5' CAT ATG CAT CAC CAT CAC CAT CAC GTT TCT GAT ATT
CCG AGA G 3' (SEQ ID NO: 20); and

Mo3PCR-EcoRI:

5' GAATTCCTATGTTTTATAATTGATGGAAAC 3' (SEQ ID NO: 21).

The human equivalents of the clones are constructed using the same 
strategy with human oligonucleotide sequences. 
Other embodiments are within the claims.

All publications, patents, and patent applications mentioned herein 
are hereby incorporated by reference. 

What is claimed is: 
1. A protein comprising a fibronectin type III domain having at least
one randomized loop, said protein being characterized by its ability to bind to a
compound that is not bound by the corresponding naturally-occurring
fibronectin.

2. The protein of claim 1, wherein said fibronectin type III domain is
a mammalian fibronectin type III domain.

3. The protein of claim 2, wherein said fibronectin type III domain is
a human fibronectin type III domain.

4. The protein of claim 1, wherein said protein comprises the tenth
module of said fibronectin type III domain (¹⁰Fn3).

5. The protein of claim 4, wherein said compound binding is
mediated by one ¹⁰Fn3 loop.

6. The protein of claim 4, wherein said compound binding is
mediated by two ¹⁰Fn3 loops.

7. The protein of claim 4, wherein said compound binding is
mediated by three ¹⁰Fn3 loops.

8. The protein of claim 4, wherein the second loop of said ¹⁰Fn3 is
extended in length relative to the naturally-occurring module.

9. The protein of claim 4, wherein said ¹⁰Fn3 lacks an
integrin-binding motif.

10. The protein of claim 9, wherein said integrin-binding motif is
replaced by an amino acid sequence comprising a basic amino acid-neutral
amino acid-acidic amino acid motif.

11. The protein of claim 10, wherein said integrin-binding motif is
replaced by an amino acid sequence comprising serine-glycine-glutamate.

12. The protein of claim 1, wherein said protein lacks disulfide
bonds.

13. The protein of claim 1, wherein said protein is part of a fusion
protein.

14. The protein of claim 13, wherein said fusion protein further
comprises an immunoglobulin Fc domain.

15. The protein of claim 13, wherein said fusion protein further
comprises a complement protein.

16. The protein of claim 13, wherein said fusion protein further
comprises a toxin protein.

17. The protein of claim 13, wherein said fusion protein further
comprises an albumin protein.

18. The protein of claim 1, wherein said protein is covalently bound
to a nucleic acid.

19. The protein of claim 18, wherein said nucleic acid encodes said
protein.

20. The protein of claim 18, wherein said nucleic acid is RNA.

21. The protein of claim 1, wherein said protein is a multimer.

22. The protein of claim 1 or 9, wherein said protein is formulated in
a physiologically-acceptable carrier.

23. A nucleic acid encoding the protein of claim 1 or 4.

24. The nucleic acid of claim 23, wherein said nucleic acid is DNA.

25. The nucleic acid of claim 23, wherein said nucleic acid is RNA.

26. A method for generating a protein comprising a fibronectin type
III domain which is pharmaceutically acceptable to a mammal, said method
comprising removing an integrin-binding domain from said fibronectin type III
domain.

27. The method of claim 26, wherein said integrin binding motif is
replaced by an amino acid sequence comprising a basic amino acid-neutral
amino acid-acidic amino acid motif.

28. The protein of claim 27, wherein said integrin-binding motif is
replaced by an amino acid sequence comprising serine-glycine-glutamate.

29. The method of claim 26, wherein said at least one loop of said
fibronectin type III domain is randomized.

30. The method of claim 26, wherein said protein comprises the
tenth module of said fibronectin type III domain.

31. The protein of claim 26, wherein said protein is part of a fusion
protein.

32. The protein of claim 31, wherein said fusion protein further
comprises an immunoglobulin domain.

33. The protein of claim 31, wherein said fusion protein further
comprises a complement protein.

34. The protein of claim 31, wherein said fusion protein further
comprises a toxin protein.

35. The protein of claim 31, wherein said fusion protein further
comprises an albumin protein.

36. The method of claim 26, wherein said mammal is a human.

37. A method for obtaining a protein which binds to a compound, 
said method comprising: 
(a) contacting said compound with a candidate protein, said
candidate protein comprising a fibronectin type III domain having at least one
randomized loop, said contacting being carried out under conditions that allow
compound-protein complex formation; and

(b) obtaining, from said complex, said protein which binds to said
compound.

38. A method for obtaining a compound which binds to a protein,
said protein comprising a fibronectin type III domain having at least one
randomized loop, said method comprising:

(a) contacting said protein with a candidate compound, said
contacting being carried out under conditions that allow compound-protein
complex formation; and

(b) obtaining, from said complex, said compound which binds to said
protein.

39. The method of claim 37, said method further comprising
randomizing at least one loop of said fibronectin type III domain of said protein
obtained in step (b) and repeating said steps (a) and (b) using said further
randomized protein.

40. The method of claim 38, said method further comprising
modifying said compound obtained in step (b) and repeating said steps (a) and
(b) using said further modified compound.

41. The method of claim 37 or 38, wherein said compound is a 

protein. 
42. The method of claim 37 or 38, wherein said fibronectin type III
domain is a mammalian fibronectin type III domain.

43. The method of claim 42, wherein said fibronectin type III
domain is a human fibronectin type III domain.

44. The method of claim 37 or 38, wherein said protein comprises
the tenth module of said fibronectin type III domain (¹⁰Fn3).

45. The method of claim 44, wherein said compound binding is
mediated by one ¹⁰Fn3 loop.

46. The method of claim 44, wherein said compound binding is
mediated by two ¹⁰Fn3 loops.

47. The method of claim 44, wherein said compound binding is
mediated by three ¹⁰Fn3 loops.

48. The method of claim 44, wherein the second loop of said ¹⁰Fn3
is extended in length relative to the naturally-occurring module.

49. The method of claim 44, wherein said ¹⁰Fn3 lacks an
integrin-binding motif.

50. The method of claim 37, wherein said compound is immobilized
on a solid support.

51. The method of claim 38, wherein said protein is immobilized on
a solid support.

52. The method of claim 50 or 51, wherein said solid support is a
column or microchip.






WO 00/34784

PCT/US99/29317





[Figure 8. Labels: mRNA-His6-10Fn3; 10Fn3; His6-10Fn3; Ligated; mRNA-10Fn3]







Figure 9 
Figure 10

SEQUENCE LISTING

<110> Phylos, Inc.

<120> PROTEIN SCAFFOLDS FOR ANTIBODY MIMICS
AND OTHER BINDING PROTEINS

<130> 50036/021WO2 

<150> 60/111,737 
<151> 1998-12-10 

<160> 21 

<170> FastSEQ for Windows Version 4.0 

<210> 1
<211> 122
<212> DNA

<213> Homo sapiens
<400> 1

ggaattccta atacgactca ctatagggac aattactatt tacaattaca atgcatcacc 60
atcaccatca cgtttctgat gttccgaggg acctggaagt tgttgctgcg acccccacca 120
gc 122

<210> 2
<211> 104
<212> DNA
<213> Homo sapiens

<400> 2

ggaattccta atacgactca ctatagggac aattactatt tacaattaca atggtttctg 60
atgttccgag ggacctggaa gttgttgctg cgacccccac cagc 104

<210> 3
<211> 126
<212> DNA

<213> Homo sapiens
<220>

<221> misc_feature
<222> (1) . . . (126)
<223> n = A,T,C or G

<221> misc_feature
<222> (1) . . . (126)
<223> s = C or G

<400> 3

agcggatgcc ttgtcgtcgt cgtccttgta gtcgctcttc cctgtttctc cgtaagtgat 60
cctgtaatat ctsnnsnnsn nsnnsnnsnn snnccagctg atcagtaggc tggtgggggt 120
cgcagc 126
<210> 4

<211> 62 

<212> DNA 

<213> Homo sapiens 

<400> 4 

agcggatgcc ttgtcgtcgt cgtccttgta gtcgctcttc cctgtttctc cgtaagtgat 60 
cc 62 

<210> 5 

<211> 99 

<212> DNA 

<213> Homo sapiens 

<400> 5 

ggaattccta atacgactca ctatagggac aattactatt tacaattaca atgcatcacc 60 
atcaccatca cctcttcaca ggaggaaata gccctgtcc 99 

<210> 6

<211> 132 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature
<222> (1) . . . (132) 
<223> n = A,T,C or G 

<221> misc_feature 
<222> (1) ... (132) 
<223> s = C or G 

<400> 6 

agcggatgcc ttgtcgtcgt cgtccttgta gtcgctcttc gtataatcaa ctccaggttt 60 

aaggccgctg atggtagctg tsnnsnnsnn snnaggcaca gtgaactcct ggacagggct 120

atttcctcct gt 132 

<210> 7 

<211> 64 

<212> DNA 

<213> Homo sapiens 

<400> 7 

agcggatgcc ttgtcgtcgt cgtccttgta gtcgctcttc gtataatcaa ctccaggttt 60 
aagg 64 

<210> 8 

<211> 101 

<212> DNA 

<213> Homo sapiens 

<400> 8 

ggaattccta atacgactca ctatagggac aattactatt tacaattaca atgcatcacc 60 
atcaccatca cctcttctat accatcactg tgtatgctgt c 101 
<210> 9
<211> 114 
<212> DNA 

<213> Homo sapiens 
<220> 

<221> misc_feature
<222> (1)...(114)
<223> n = A,T,C or G

<221> misc_feature
<222> (1)...(114)
<223> s = C or G

<400> 9 

agcggatgcc ttgtcgtcgt cgtccttgta gtccgttcgg taattaatgg aaattggsnn 60 
snnsnnsnns nnsnnsnnsn nsnnsnnagr gacagcatac acagtgatgg tata 114 

<210> 10

<211> 57

<212> DNA 

<213> Homo sapiens 
<400> 10 

agcggatgcc ttgtcgtcgt cgtccttgca gtccgttcgg taattaatgg aaattgg 57 

<210> 11 
<211> 45 
<212> DNA 

<213> T7 phage and tobacco mosaic virus 
<400> 11 

gcgtaatacg actcactata gggacaacta ccacttacaa ttaca 45 

<210> 12 
<211> 33 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Flag sequence 
<400> 12 

agcggatgcc ttgtcgtcgt cgtcctcgca gtc 

<210> 13 
<211> 19 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Splint oligonucleotide 

<221> misc_feature
<222> (1)...(19)
<223> n = A,T,C or G
<400> 13 

rtttcrtttn agcggatgc 19 

<210> 14
<211> 20
<212> DNA

<213> Artificial Sequence
<220>

<223> Puromycin linker oligonucleotide
<400> 14

a^i.i.i .« A<ftu ua^aaaaacc 2 0 




<210> 15
<211> 30
<212> DNA
<213> Mus musculus

<400> 15

gca-j I It Tr.'irttactg gggccaaggg 30 

<210> 16

<211> 27

<212> DNA

<213> Mus musculus

<400> 16

999^\}^1-J^' 1 {l-jggtaggtc acagtcc 27 

<210> 17

<211> 30

<212> DNA

<213> Mus musculus



<400> 17 

tttgctagct ttaccaggag agtgggaggc 
<210> 18 

<211> 33

<212> DNA 

<213> Mus musculus 

<400> 18 

aaaaagcttg ccaaaacgac acccccatct gtc 

<210> 19 

<211> 33 

<212> DNA 

<213> Mus musculus 

<400> 19 

catatggttt ctgatattcc gagagatctg gag 33
<210> 20

<211> 43 

<212> DNA 

<213> Mus musculus 

<400> 20 

catatgcatc accatcacca tcacgtttct gatattccga gag 

<210> 21 
<211> 30 
<212> DNA 

<213> Mus musculus 



<400> 21 

gaattcctat gttttataat tgatggaaac 
PROTEIN SCAFFOLDS FOR ANTIBODY MIMICS
AND OTHER BINDING PROTEINS

Background of the Invention 
This invention relates to protein scaffolds useful, for example, for the 
generation of products having novel binding characteristics. 

Proteins having relatively defined three-dimensional structures,
commonly referred to as protein scaffolds, may be used as reagents for the
design of engineered products. These scaffolds typically contain one or more
regions which are amenable to specific or random sequence variation, and such
sequence randomization is often carried out to produce libraries of proteins
from which desired products may be selected. One particular area in which
such scaffolds are useful is the field of antibody design.

A number of previous approaches to the manipulation of the 
mammalian immune system to obtain reagents or drugs have been attempted. 
These have included injecting animals with antigens of interest to obtain 
mixtures of polyclonal antibodies reactive against specific antigens, production
of monoclonal antibodies in hybridoma cell culture (Koehler and Milstein,
Nature 256:495, 1975), modification of existing monoclonal antibodies to 
obtain new or optimized recognition properties, creation of novel antibody 
fragments with desirable binding characteristics, and randomization of single 
chain antibodies (created by connecting the variable regions of the heavy and
light chains of antibody molecules with a flexible peptide linker) followed by
selection for antigen binding by phage display (Clackson et al., Nature 352:624, 
1991). 

In addition, several non-immunoglobulin protein scaffolds have been 
proposed for obtaining proteins with novel binding properties. For example, a
"minibody" scaffold, which is related to the immunoglobulin fold, has been
designed by deleting three beta strands from a heavy chain variable domain of a 
monoclonal antibody (Tramontano et al., J. Mol. Recognit. 7:9, 1994). This 
protein includes 61 residues and can be used to present two hypervariable 
loops. These two loops have been randomized and products selected for
antigen binding, but thus far the framework appears to have somewhat limited
utility due to solubility problems. Another framework used to display loops has 
been tendamistat, a 74 residue, six-strand beta sheet sandwich held together by 
two disulfide bonds (McConnell and Hoess, J. Mol. Biol. 250:460, 1995). This
scaffold includes three loops, but, to date, only two of these loops have been
examined for randomization potential. 

Other proteins have been tested as frameworks and have been used to 
display randomized residues on alpha helical surfaces (Nord et al., Nat. 
Biotechnol. 15:772, 1997; Nord et al., Protein Eng. 8:601, 1995), loops
between alpha helices in alpha helix bundles (Ku and Schultz, Proc. Natl. Acad.
Sci. USA 92:6552, 1995), and loops constrained by disulfide bridges, such as
those of the small protease inhibitors (Markland et al., Biochemistry 35:8045,
1996; Markland et al., Biochemistry 35:8058, 1996; Rottgen and Collins, Gene
164:243, 1995; Wang et al., J. Biol. Chem. 270:12250, 1995). 

Summary of the Invention

The present invention provides a new family of proteins capable of 
evolving to bind any compound of interest. These proteins, which make use of 
a fibronectin or fibronectin-like scaffold, function in a manner characteristic of 
natural or engineered antibodies (that is, polyclonal, monoclonal, or
single-chain antibodies) and, in addition, possess structural advantages.
Specifically, the structure of these antibody mimics has been designed for
optimal folding, stability, and solubility, even under conditions which normally
lead to the loss of structure and function in antibodies.

These antibody mimics may be utilized for the purpose of designing 
proteins which are capable of binding to virtually any compound (for example, 
any protein) of interest. In particular, the fibronectin-based molecules 
described herein may be used as scaffolds which are subjected to directed
evolution designed to randomize one or more of the three fibronectin loops 
which are analogous to the complementarity-determining regions (CDRs) of an 
antibody variable region. Such a directed evolution approach results in the 
production of antibody-like molecules with high affinities for antigens of
interest. In addition, the scaffolds described herein may be used to display
defined exposed loops (for example, loops previously randomized and selected
on the basis of antigen binding) in order to direct the evolution of molecules
that bind to such introduced loops. A selection of this type may be carried out 
to identify recognition molecules for any individual CDR-like loop or,
alternatively, for the recognition of two or all three CDR-like loops combined
into a non-linear epitope. 

Accordingly, the present invention features a protein that includes a 
fibronectin type III domain having at least one randomized loop, the protein 
being characterized by its ability to bind to a compound that is not bound by the
corresponding naturally-occurring fibronectin.

In preferred embodiments, the fibronectin type III domain is a 
mammalian (for example, a human) fibronectin type III domain; and the protein 
includes the tenth module of the fibronectin type III (¹⁰Fn3) domain. In such
proteins, compound binding is preferably mediated by either one, two, or three
¹⁰Fn3 loops. In other preferred embodiments, the second loop of ¹⁰Fn3 may be
extended in length relative to the naturally-occurring module, or the ¹⁰Fn3 may
lack an integrin-binding motif. In these molecules, the integrin-binding motif
may be replaced by an amino acid sequence in which a basic amino acid-neutral
amino acid-acidic amino acid sequence (in the N-terminal to C-terminal
direction) replaces the integrin-binding motif; one preferred sequence is
serine-glycine-glutamate. In another preferred embodiment, the fibronectin type III
domain-containing proteins of the invention lack disulfide bonds.

Any of the fibronectin type III domain-containing proteins described
herein may be formulated as part of a fusion protein (for example, a fusion
protein which further includes an immunoglobulin domain, a complement
protein, a toxin protein, or an albumin protein). In addition, any of the
fibronectin type III domain proteins may be covalently bound to a nucleic acid
(for example, an RNA), and the nucleic acid may encode the protein.
Moreover, the protein may be a multimer, or, particularly if it lacks an
integrin-binding motif, it may be formulated in a physiologically-acceptable carrier.

The present invention also features proteins that include a
fibronectin type III domain having at least one mutation in a β-sheet sequence
which changes the scaffold structure. Again, these proteins are characterized
by their ability to bind to compounds that are not bound by the corresponding
naturally-occurring fibronectin. 

In a related aspect, the invention further features nucleic acids 
encoding any of the proteins of the invention. In preferred embodiments, the
nucleic acid is DNA or RNA.

In another related aspect, the invention also features a method for 
generating a protein which includes a fibronectin type III domain and which is 
pharmaceutically acceptable to a mammal, involving removing the
integrin-binding domain of said fibronectin type III domain. This method may be
applied to any of the fibronectin type III domain-containing proteins described
above and is particularly useful for generating proteins for human therapeutic 
applications. The invention also features such fibronectin type III domain- 
containing proteins which lack integrin-binding domains. 

In yet other related aspects, the invention features screening methods
which may be used to obtain or evolve randomized fibronectin type III proteins 
capable of binding to compounds of interest, or to obtain or evolve compounds 
(for example, proteins) capable of binding to a particular protein containing a
randomized fibronectin type III motif. In addition, the invention features
screening procedures which combine these two methods, in any order, to obtain
either compounds or proteins of interest. 

In particular, the first screening method, useful for the isolation or 
identification of randomized proteins of interest, involves: (a) contacting the
compound with a candidate protein, the candidate protein including a
fibronectin type III domain having at least one randomized loop, the contacting
being carried out under conditions that allow compound-protein complex
formation; and (b) obtaining, from the complex, the protein which binds to the 
compound. 

The second screening method, for isolating or identifying a
compound which binds to a protein having a randomized fibronectin type III
domain, involves: (a) contacting the protein with a candidate compound, 
the contacting being carried out under conditions that allow compound-protein 
complex formation; and (b) obtaining, from the complex, the compound which
binds to the protein.

In preferred embodiments, the methods further involve either 
randomizing at least one loop of the fibronectin type III domain of the protein 
obtained in step (b) and repeating steps (a) and (b) using the further randomized 
protein, or modifying the compound obtained in step (b) and repeating steps (a)
and (b) using the further modified compound. In addition, the compound is
preferably a protein, and the fibronectin type III domain is preferably a 
mammalian (for example, a human) fibronectin type III domain. In other 
preferred embodiments, the protein includes the tenth module of the fibronectin 
type III domain (¹⁰Fn3), and binding is mediated by one, two or three ¹⁰Fn3
loops. In addition, the second loop of ¹⁰Fn3 may be extended in length relative
to the naturally-occurring module, or ¹⁰Fn3 may lack an integrin-binding motif.
Again, as described above, the integrin-binding motif may be replaced by an
amino acid sequence in which a basic amino acid-neutral amino acid-acidic
amino acid sequence (in the N-terminal to C-terminal direction) replaces the 
integrin-binding motif; one preferred sequence is serine-glycine-glutamate. 

The selection methods described herein may be carried out using any 
fibronectin type III domain-containing protein. For example, the fibronectin
type III domain-containing protein may lack disulfide bonds, or may be
formulated as part of a fusion protein (for example, a fusion protein which
further includes an immunoglobulin Fc domain, a complement protein, a toxin
protein, or an albumin protein). In addition, selections may be carried out using 
the fibronectin type III domain proteins covalently bound to nucleic acids (for
example, RNAs or any nucleic acid which encodes the protein). Moreover, the
selections may be carried out using fibronectin domain-containing protein 
multimers. 

Preferably, the selections involve the immobilization of the binding 
target on a solid support. Preferred solid supports include columns (for
example, affinity columns, such as agarose columns) or microchips.

As used herein, by "fibronectin type III domain" is meant a domain 
having 7 or 8 beta strands which are distributed between two beta sheets, which 
themselves pack against each other to form the core of the protein, and further 
containing loops which connect the beta strands to each other and are solvent
exposed. There are at least three such loops at each edge of the beta sheet
sandwich, where the edge is the boundary of the protein perpendicular to the 
direction of the beta strands. Preferably, a fibronectin type III domain includes 
a sequence which exhibits at least 30% amino acid identity, and preferably at 
least 50% amino acid identity, to the sequence encoding the structure of the
¹⁰Fn3 domain referred to as "1ttg" (ID = "1ttg" (one ttg)) available from the
Protein Data Base. Sequence identity referred to in this definition is
determined by the Homology program, available from Molecular Simulation
(San Diego, CA). The invention further includes polymers of ¹⁰Fn3-related
molecules, which are an extension of the use of the monomer structure, whether
or not the subunits of the polyprotein are identical or different in sequence.
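
The identity thresholds in the definition above (at least 30%, preferably at least 50%) can be illustrated with a minimal Python sketch. This is not the Homology program named in the text. The function names, the gap symbol, the choice to ignore gapped columns, and the toy sequences are all assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # assumption: gapped columns are not counted
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


def identity_tier(identity_pct: float) -> str:
    """Map a percent identity onto the thresholds stated in the definition."""
    if identity_pct >= 50.0:
        return "preferred (at least 50%)"
    if identity_pct >= 30.0:
        return "meets minimum (at least 30%)"
    return "below 30%"


# Toy, made-up aligned sequences (hypothetical; not real Fn3 sequences).
example = percent_identity("ACDE-F", "ACDQGF")
print(example, identity_tier(example))
```

In practice the result depends on how the alignment is produced and how gaps are scored, so a figure from this sketch need not match one from a dedicated alignment program.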

By "naturally occurring fibronectin" is meant any fibronectin protein
that is encoded by a living organism.

By "randomized" is meant including one or more amino acid
alterations relative to a template sequence.

By a "protein" is meant any sequence of two or more amino acids,
regardless of length, post-translation modification, or function. "Protein" and
"peptide" are used interchangeably herein.

By "RNA" is meant a sequence of two or more covalently bonded,
naturally occurring or modified ribonucleotides. One example of a modified
RNA included within this term is phosphorothioate RNA.

By "DNA" is meant a sequence of two or more covalently bonded, 
naturally occurring or modified deoxyribonucleotides.

By a "nucleic acid" is meant any two or more covalently bonded
nucleotides or nucleotide analogs or derivatives. As used herein, this term 
includes, without limitation, DNA, RNA, and PNA. 

By "pharmaceutically acceptable" is meant a compound or protein 
that may be administered to an animal (for example, a mammal) without
significant adverse medical consequences.

By "physiologically acceptable carrier" is meant a carrier which does 
not have a significant detrimental impact on the treated host and which retains 
the therapeutic properties of the compound with which it is administered. One
exemplary physiologically acceptable carrier is physiological saline. Other
physiologically acceptable carriers and their formulations are known to one
skilled in the art and are described, for example, in Remington's
Pharmaceutical Sciences (18th edition), ed. A. Gennaro, 1990, Mack Publishing
Company, Easton, PA, incorporated herein by reference.

By "selecting" is meant substantially partitioning a molecule from 
other molecules in a population. As used herein, a "selecting" step provides at 
least a 2-fold, preferably, a 30-fold, more preferably, a 100-fold, and, most
preferably, a 1000-fold enrichment of a desired molecule relative to undesired
molecules in a population following the selection step. A selection step may be
repeated any number of times, and different types of selection steps may be 
combined in a given approach. 
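
The enrichment thresholds in this definition can be made concrete with a small Python sketch. It assumes that enrichment is measured as the ratio of desired to undesired molecules after a selection step divided by the same ratio before it. The function names and the example counts are hypothetical.

```python
def fold_enrichment(desired_before: float, undesired_before: float,
                    desired_after: float, undesired_after: float) -> float:
    """(desired:undesired ratio after selection) / (same ratio before selection)."""
    if desired_before <= 0 or undesired_before <= 0 or undesired_after <= 0:
        raise ValueError("counts used as denominators must be positive")
    # Rearranged to a single division: (d_after/u_after) / (d_before/u_before).
    return (desired_after * undesired_before) / (undesired_after * desired_before)


def selection_tier(fold: float) -> str:
    """Classify a fold enrichment against the thresholds given in the definition."""
    if fold >= 1000:
        return "most preferred (at least 1000-fold)"
    if fold >= 100:
        return "more preferred (at least 100-fold)"
    if fold >= 30:
        return "preferred (at least 30-fold)"
    if fold >= 2:
        return "minimum (at least 2-fold)"
    return "below the 2-fold minimum"


# Hypothetical counts: 10 desired among 10000 undesired molecules before the
# step, 8 desired among 80 undesired molecules after it.
fold = fold_enrichment(10, 10000, 8, 80)
print(fold, selection_tier(fold))
```

Because selection steps may be repeated, the overall enrichment across several rounds would, under this same assumption, be the product of the per-round values.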

By "binding partner," as used herein, is meant any molecule which 
has a specific, covalent or non-covalent affinity for a portion of a desired
compound (for example, protein) of interest. Examples of binding partners
include, without limitation, members of antigen/antibody pairs,
protein/inhibitor pairs, receptor/ligand pairs (for example, cell surface
receptor/ligand pairs, such as hormone receptor/peptide hormone pairs), 
enzyme/substrate pairs (for example, kinase/substrate pairs),
lectin/carbohydrate pairs, oligomeric or heterooligomeric protein aggregates,
DNA binding protein/DNA binding site pairs, RNA/protein pairs, and nucleic 
acid duplexes, heteroduplexes, or ligated strands, as well as any molecule 
which is capable of forming one or more covalent or non-covalent bonds (for 
example, disulfide bonds) with any portion of another molecule (for example, a
compound or protein).

By a "solid support" is meant, without limitation, any column (or 
column material), bead, test tube, microtiter dish, solid particle (for example, 
agarose or sepharose), microchip (for example, silicon, silicon-glass, or gold
chip), or membrane (for example, the membrane of a liposome or vesicle) to
which an affinity complex may be bound, either directly or indirectly (for 
example, through other binding partner intermediates such as other antibodies 
or Protein A), or in which an affinity complex may be embedded (for example, 
through a receptor or channel).

The present invention provides a number of advantages. For 
example, as described in more detail below, the present antibody mimics 
exhibit improved biophysical properties, such as stability under reducing 
conditions and solubility at high concentrations. In addition, these molecules
may be readily expressed and folded in prokaryotic systems, such as E. coli, in
eukaryotic systems, such as yeast, and in in vitro translation systems, such as 
the rabbit reticulocyte lysate system. Moreover, these molecules are extremely 
amenable to affinity maturation techniques involving multiple cycles of 
selection, including in vitro selection using RNA-protein fusion technology
(Roberts and Szostak, Proc. Natl. Acad. Sci. USA 94:12297, 1997; Szostak et
al., U.S.S.N. 09/007,005 and U.S.S.N. 09/247,190; Szostak et al.,
WO98/31700), phage display (see, for example, Smith and Petrenko, Chem.
Rev. 97:317, 1997), and yeast display systems (see, for example, Boder and 
Wittrup, Nature Biotech. 15:553, 1997). 

Other features and advantages of the present invention will be
apparent from the following detailed description thereof, and from the claims.

Brief Description of the Drawings 
FIGURE 1 is a photograph showing a comparison between the 
structures of antibody heavy chain variable regions from camel (dark blue) and
llama (light blue), in each of two orientations.

FIGURE 2 is a photograph showing a comparison between the 
structures of the camel antibody heavy chain variable region (dark blue), the
llama antibody heavy chain variable region (light blue), and a fibronectin type
III module number 10 (¹⁰Fn3) (yellow).

FIGURE 3 is a photograph showing a fibronectin type III module 
number 10 (¹⁰Fn3), with the loops corresponding to the antigen-binding loops
in IgG heavy chains highlighted in red.

FIGURE 4 is a graph illustrating a sequence alignment between a 
fibronectin type III protein domain and related protein domains.

FIGURE 5 is a photograph showing the structural similarities 
between a ¹⁰Fn3 domain and 15 related proteins, including fibronectins,
tenascins, collagens, and undulin. In this photograph, the regions are labeled as
follows: constant, dark blue; conserved, light blue; neutral, white; variable, red;
and RGD integrin-binding motif (variable), yellow.

FIGURE 6 is a photograph showing space filling models of 
fibronectin III modules 9 and 10, in each of two different orientations. The two 
modules and the integrin binding loop (RGD) are labeled. In this figure, blue
indicates positively charged residues, red indicates negatively charged residues, 
and white indicates uncharged residues. 

FIGURE 7 is a photograph showing space filling models of 
fibronectin III modules 7-10, in each of three different orientations. The four
modules are labeled. In this figure, blue indicates positively charged residues, 
red indicates negatively charged residues, and white indicates uncharged 
residues. 

FIGURE 8 is a photograph illustrating the formation, under different 
salt conditions, of RNA-protein fusions which include fibronectin type III 
domains. 

FIGURE 9 is a series of photographs illustrating the selection of 
fibronectin type III domain-containing RNA-protein fusions, as measured by 
PCR signal analysis.

FIGURE 10 is a graph illustrating an increase in the percent TNF-α
binding during the selections described herein, as well as a comparison between
RNA-protein fusion and free protein selections.

FIGURE 11 is a series of schematic representations showing IgG,
¹⁰Fn3, Fn-CH1-CH2-CH3, and Fn-CH2-CH3 (clockwise from top left).

FIGURE 12 is a photograph showing a molecular model of
Fn-CH1-CH2-CH3 based on known three-dimensional structures of IgG (X-ray
crystallography) and ¹⁰Fn3 (NMR and X-ray crystallography).

Detailed Description

The novel antibody mimics described herein have been designed to 
be superior both to antibody-derived fragments and to non-antibody 
frameworks, for example, those frameworks described above. 

The major advantage of these antibody mimics over antibody 
fragments is structural. These scaffolds are derived from whole, stable, and
soluble structural modules found in human body fluid proteins. Consequently, 
they exhibit better folding and thermostability properties than antibody 
fragments, whose creation involves the removal of parts of the antibody native 
fold, often exposing amino acid residues that, in an intact antibody, would be 
buried in a hydrophobic environment, such as an interface between variable and
constant domains. Exposure of such hydrophobic residues to solvent increases 
the likelihood of aggregation. 

In addition, the antibody mimics described herein have no disulfide 
bonds, which have been reported to retard or prevent proper folding of antibody 
fragments under certain conditions. Since the present scaffolds do not rely on
disulfides for native fold stability, they are stable under reducing conditions, 
unlike antibodies and their fragments which unravel upon disulfide bond 
breakdown.

Moreover, these fibronectin-based scaffolds provide the functional 
advantages of antibody molecules. In particular, despite the fact that the ¹⁰Fn3
module is not an immunoglobulin, its overall fold is close to that of the variable
region of the IgG heavy chain (Figure 2), making it possible to display the three
fibronectin loops analogous to CDRs in relative orientations similar to those of 
native antibodies. Because of this structure, the present antibody mimics 
possess antigen binding properties that are similar in nature and affinity to 
those of antibodies, and a loop randomization and shuffling strategy may be 
employed in vitro that is similar to the process of affinity maturation of
antibodies in vivo.

There are now described below exemplary fibronectin-based 
scaffolds and their use for identifying, selecting, and evolving novel binding 
proteins as well as their target ligands. These examples are provided for the 
purpose of illustrating, and not limiting, the invention.

Fn3 Structural Motif

The antibody mimics of the present invention are based on the 
structure of a fibronectin module of type III (Fn3), a common domain found in 
mammalian blood and structural proteins. This domain occurs more than 400
times in the protein sequence database and has been estimated to occur in 2%
of the proteins sequenced to date, including fibronectins, tenascin, intracellular
cytoskeletal proteins, and prokaryotic enzymes (Bork and Doolittle, Proc. Natl.
Acad. Sci. USA 89:8990, 1992; Bork et al., Nature Biotech. 15:553, 1997;
Meinke et al., J. Bacteriol. 175:1910, 1993; Watanabe et al., J. Biol. Chem.
265:15659, 1990). In particular, these scaffolds include, as templates, the tenth
module of human Fn3 (¹⁰Fn3), which comprises 94 amino acid residues. The
overall fold of this domain is closely related to that of the smallest functional 
antibody fragment, the variable region of the heavy chain, which comprises the
entire antigen recognition unit in camel and llama IgG (Figures 1 and 2). The major
differences between camel and llama domains and the ¹⁰Fn3 domain are that (i)
¹⁰Fn3 has fewer beta strands (seven vs. nine) and (ii) the two beta sheets packed
against each other are connected by a disulfide bridge in the camel and llama
domains, but not in ¹⁰Fn3.

The three loops of ¹⁰Fn3 corresponding to the antigen-binding loops
of the IgG heavy chain run between amino acid residues 21-31, 51-56, and
76-88 (Figure 3). The lengths of the first and the third loops, 11 and 12 residues,
respectively, fall within the range of the corresponding antigen-recognition
loops found in antibody heavy chains, that is, 10-12 and 3-25 residues,
respectively. Accordingly, once randomized and selected for high antigen
affinity, these two loops make contacts with antigens equivalent to the contacts 
of the corresponding loops in antibodies. 

In contrast, the second loop of ¹⁰Fn3 is only 6 residues long, whereas
the corresponding loop in antibody heavy chains ranges from 16-19 residues.
To optimize antigen binding, therefore, the second loop of ¹⁰Fn3 is preferably
extended by 10-13 residues (in addition to being randomized) to obtain the
greatest possible flexibility and affinity in antigen binding. Indeed, in general,
the lengths as well as the sequences of the CDR-like loops of the antibody
mimics may be randomized during in vitro or in vivo affinity maturation (as
described in more detail below). 
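
The preferred extension of 10-13 residues follows directly from the loop lengths quoted above, as this short Python sketch shows. The constant and function names are hypothetical; the numbers are those given in the text.

```python
NATIVE_SECOND_LOOP_LENGTH = 6      # residues in the second loop, per the text
ANTIBODY_LOOP_RANGE = (16, 19)     # corresponding antibody heavy chain loop, per the text


def extension_range(native_length: int, target_range: tuple) -> tuple:
    """Residues to add so the native loop spans the target length range."""
    shortest, longest = target_range
    return (shortest - native_length, longest - native_length)


print(extension_range(NATIVE_SECOND_LOOP_LENGTH, ANTIBODY_LOOP_RANGE))
```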

The tenth human fibronectin type III domain, ¹⁰Fn3, refolds rapidly
even at low temperature; its backbone conformation has been recovered within
1 second at 5°C. Thermodynamic stability of ¹⁰Fn3 is high (ΔG = 24 kJ/mol =
5.7 kcal/mol), correlating with its high melting temperature of 110°C.
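
The two stability figures quoted above are the same quantity in different units. A minimal Python check, assuming the thermochemical calorie (1 kcal = 4.184 kJ) and using hypothetical names, is:

```python
KJ_PER_KCAL = 4.184  # assumption: thermochemical calorie


def kj_to_kcal(kj_per_mol: float) -> float:
    """Convert an energy in kJ/mol to kcal/mol."""
    return kj_per_mol / KJ_PER_KCAL


# 24 kJ/mol is the value quoted in the text.
print(round(kj_to_kcal(24.0), 1))
```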

One of the physiological roles of ¹⁰Fn3 is as a subunit of fibronectin,
a glycoprotein that exists in a soluble form in body fluids and in an insoluble
form in the extracellular matrix (Dickinson et al., J. Mol. Biol. 236:1079,
1994). A fibronectin monomer of 220-250 kD contains 12 type I modules, two
type II modules, and 17 fibronectin type III modules (Potts and Campbell, Curr.
Opin. Cell Biol. 6:648, 1994). Different type III modules are involved in the
binding of fibronectin to integrins, heparin, and chondroitin sulfate. ¹⁰Fn3 was
found to mediate cell adhesion through an integrin-binding Arg-Gly-Asp
(RGD) motif on one of its exposed loops. Similar RGD motifs have been
shown to be involved in integrin binding by other proteins, such as fibrinogen,
von Willebrand factor, and vitronectin (Hynes et al., Cell 69:11, 1992). No
other matrix- or cell-binding roles have been described for ¹⁰Fn3.

The observation that ¹⁰Fn3 has only slightly more adhesive activity
than a short peptide containing RGD is consistent with the conclusion that the
cell-binding activity of ¹⁰Fn3 is localized in the RGD peptide rather than
distributed throughout the ¹⁰Fn3 structure (Baron et al., Biochemistry 31:2068,
1992). The fact that ¹⁰Fn3 without the RGD motif is unlikely to bind to other
plasma proteins or extracellular matrix makes ¹⁰Fn3 a useful scaffold to replace
antibodies. In addition, the presence of ¹⁰Fn3 in natural fibrinogen in the
bloodstream suggests that ¹⁰Fn3 itself is unlikely to be immunogenic in the
organism of origin. 

In addition, we have determined that the ¹⁰Fn3 framework possesses
exposed loop sequences tolerant of randomization, facilitating the generation of
diverse pools of antibody mimics. This determination was made by examining
the flexibility of the ¹⁰Fn3 sequence. In particular, the human ¹⁰Fn3 sequence
was aligned with the sequences of fibronectins from other sources as well as
sequences of related proteins (Figure 4), and the results of this alignment were
mapped onto the three-dimensional structure of the human ¹⁰Fn3 domain
(Figure 5). This alignment revealed that the majority of conserved residues are 
found in the core of the beta sheet sandwich, whereas the highly variable 
residues are located along the edges of the beta sheets, including the N- and
C-termini, on the solvent-accessible faces of both beta sheets, and on three
solvent-accessible loops that serve as the hypervariable loops for affinity
maturation of the antibody mimics. In view of these results, the randomization
of these three loops is unlikely to have an adverse effect on the overall fold or
stability of the ¹⁰Fn3 framework itself.

For the human ¹⁰Fn3 sequence, this analysis indicates that, at a
minimum, amino acids 1-9, 44-50, 61-54, 82-94 (edges of beta sheets); 19, 21,
30-46 (even), 79-65 (odd) (solvent-accessible faces of both beta sheets); 21-31,
51-56, 76-88 (CDR-like solvent-accessible loops); and 14-16 and 36-45 (other
solvent-accessible loops and beta turns) may be randomized to evolve new or
improved compound-binding proteins. In addition, as discussed above,
alterations in the lengths of one or more solvent exposed loops may also be
included in such directed evolution methods. Alternatively, changes in the
β-sheet sequences may also be used to evolve new proteins. These mutations
change the scaffold and thereby indirectly alter loop structure(s). If this
approach is taken, mutations should not saturate the sequence, but rather few
mutations should be introduced. Preferably, no more than 10 amino acid
changes, and, more preferably, no more than 3 amino acid changes should be
introduced to the β-sheet sequences by this approach.
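
The randomization of the CDR-like loops described above can be sketched in silico as follows. This is an illustration only, not the laboratory library construction. The residue ranges are those given in the text; the 94-residue template is a hypothetical placeholder rather than the real ¹⁰Fn3 sequence, and all function names are assumptions.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids

# CDR-like solvent-accessible loops, 1-based inclusive residue ranges from the text.
CDR_LIKE_LOOPS = [(21, 31), (51, 56), (76, 88)]


def loop_positions(loops):
    """Expand inclusive (start, end) ranges into a flat list of 1-based positions."""
    return [pos for start, end in loops for pos in range(start, end + 1)]


def randomize_positions(template: str, positions, rng: random.Random) -> str:
    """Return a copy of template with a random amino acid at each listed position."""
    residues = list(template)
    for pos in positions:
        residues[pos - 1] = rng.choice(AMINO_ACIDS)
    return "".join(residues)


def count_changes(template: str, variant: str) -> int:
    """Number of positions at which the variant differs from the template."""
    return sum(1 for a, b in zip(template, variant) if a != b)


# Hypothetical placeholder template of 94 residues (NOT the real sequence).
PLACEHOLDER_TEMPLATE = "G" * 94

rng = random.Random(0)  # fixed seed so the sketch is reproducible
variant = randomize_positions(PLACEHOLDER_TEMPLATE, loop_positions(CDR_LIKE_LOOPS), rng)
print(variant)
print(count_changes(PLACEHOLDER_TEMPLATE, variant))
```

A helper such as `count_changes` could also be used to check a β-sheet variant against the suggested ceiling of 10 (or, more preferably, 3) amino acid changes.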

Fibronectin Fusions 

The antibody mimics described herein may be fused to other protein
domains. For example, these mimics may be integrated with the human
immune response by fusing the constant region of an IgG (Fc) with a ¹⁰Fn3
module, preferably through the C-terminus of ¹⁰Fn3. The Fc in such a ¹⁰Fn3-Fc
fusion molecule activates the complement component of the immune response
and increases the therapeutic value of the antibody mimic. Similarly, a fusion
between ¹⁰Fn3 and a complement protein, such as C1q, may be used to target
cells, and a fusion between ¹⁰Fn3 and a toxin may be used to specifically
destroy cells that carry a particular antigen. In addition, ¹⁰Fn3 in any form may
be fused with albumin to increase its half-life in the bloodstream and its tissue
penetration. Any of these fusions may be generated by standard techniques, for
example, by expression of the fusion protein from a recombinant fusion gene
constructed using publicly available gene sequences.

Fibronectin Scaffold Multimers 

In addition to fibronectin monomers, any of the fibronectin
constructs described herein may be generated as dimers or multimers of
¹⁰Fn3-based antibody mimics as a means to increase the valency and thus the
avidity of antigen binding. Such multimers may be generated through covalent
binding between individual ¹⁰Fn3 modules, for example, by imitating the
natural ⁸Fn3-⁹Fn3-¹⁰Fn3 C-to-N-terminus binding or by imitating antibody
dimers that are held together through their constant regions. A ¹⁰Fn3-Fc
construct may be exploited to design dimers of the general scheme of
¹⁰Fn3-Fc::Fc-¹⁰Fn3. The bonds engineered into the Fc::Fc interface may be
covalent or non-covalent. In addition, dimerizing or multimerizing partners
other than Fc can be used in ¹⁰Fn3 hybrids to create such higher order
structures.

In particular examples, covalently bonded multimers may be
generated by constructing fusion genes that encode the multimer or,
alternatively, by engineering codons for cysteine residues into monomer
sequences and allowing disulfide bond formation to occur between the
expression products. Non-covalently bonded multimers may also be generated
by a variety of techniques. These include the introduction, into monomer
sequences, of codons corresponding to positively and/or negatively charged
residues and allowing interactions between these residues in the expression
products (and therefore between the monomers) to occur. This approach may
be simplified by taking advantage of charged residues naturally present in a
monomer subunit, for example, the negatively charged residues of fibronectin.
Another means for generating non-covalently bonded antibody mimics is to
introduce, into the monomer gene (for example, at the amino- or
carboxy-termini), the coding sequences for proteins or protein domains known to
interact. Such proteins or protein domains include coiled-coil motifs, leucine
zipper motifs, and any of the numerous protein subunits (or fragments thereof)
known to direct formation of dimers or higher order multimers.

Fibronectin-Like Molecules

Although ¹⁰Fn3 represents a preferred scaffold for the generation of
antibody mimics, other molecules may be substituted for ¹⁰Fn3 in the molecules
described herein. These include, without limitation, human fibronectin
modules ¹Fn3-⁹Fn3 and ¹¹Fn3-¹⁷Fn3 as well as related Fn3 modules from
non-human animals and prokaryotes. In addition, Fn3 modules from other
proteins with sequence homology to ¹⁰Fn3, such as tenascins and undulins, may
also be used. Modules from different organisms and parent proteins may be
most appropriate for different applications; for example, in designing an
antibody mimic, it may be most desirable to generate that protein from a
fibronectin or fibronectin-like molecule native to the organism for which a
therapeutic or diagnostic molecule is intended.

Directed Evolution of Scaffold-Based Binding Proteins

The antibody mimics described herein may be used in any technique
for evolving new or improved binding proteins. In one particular example, the
target of binding is immobilized on a solid support, such as a column resin or
microtiter plate well, and the target contacted with a library of candidate
scaffold-based binding proteins. Such a library may consist of ¹⁰Fn3 clones
constructed from the wild type ¹⁰Fn3 scaffold through randomization of the
sequence and/or the length of the ¹⁰Fn3 CDR-like loops. If desired, this library
may be an RNA-protein fusion library generated, for example, by the
techniques described in Szostak et al., U.S.S.N. 09/007,005 and 09/247,190;
Szostak et al., WO98/31700; and Roberts & Szostak, Proc. Natl. Acad. Sci.
USA (1997) vol. 94, p. 12297-12302. Alternatively, it may be a DNA-protein
library (for example, as described in Lohse, DNA-Protein Fusions and Uses
Thereof, U.S.S.N. 60/110,549, filed December 2, 1998 and U.S.S.N. [...],
filed December 2, 1999). The fusion library is incubated with the immobilized
target, the support is washed to remove non-specific binders, and the tightest
binders are eluted under very stringent conditions and subjected to PCR to
recover the sequence information or to create a new library of binders which
may be used to repeat the selection process, with or without further
mutagenesis of the sequence. A number of rounds of selection may be
performed until binders of sufficient affinity for the antigen are obtained.

In one particular example, the ¹⁰Fn3 scaffold may be used as the
selection target. For example, if a protein is required that binds a specific
peptide sequence presented in a ten residue loop, a single ¹⁰Fn3 clone is
constructed in which one of its loops has been set to the length of ten and to the
desired sequence. The new clone is expressed in vivo and purified, and then
immobilized on a solid support. An RNA-protein fusion library based on an
appropriate scaffold is then allowed to interact with the support, which is then
washed, and desired molecules eluted and re-selected as described above.

Similarly, the ¹⁰Fn3 scaffold may be used to find natural proteins that
interact with the peptide sequence displayed in a ¹⁰Fn3 loop. The ¹⁰Fn3 protein
is immobilized as described above, and an RNA-protein fusion library is
screened for binders to the displayed loop. The binders are enriched through
multiple rounds of selection and identified by DNA sequencing.

In addition, in the above approaches, although RNA-protein libraries
represent exemplary libraries for directed evolution, any type of scaffold-based
library may be used in the selection methods of the invention.

The antibody mimics described herein may be evolved to bind any
antigen of interest. These proteins have thermodynamic properties superior to
those of natural antibodies and can be evolved rapidly in vitro. Accordingly,
these antibody mimics may be employed in place of antibodies in all areas in
which antibodies are used, including in the research, therapeutic, and diagnostic
fields. In addition, because these scaffolds possess solubility and stability
properties superior to antibodies, the antibody mimics described herein may
also be used under conditions which would destroy or inactivate antibody
molecules. Finally, because the scaffolds of the present invention may be
evolved to bind virtually any compound, these molecules provide completely
novel binding proteins which also find use in the research, diagnostic, and
therapeutic areas.

Experimental Results

Exemplary scaffold molecules described above were generated and
tested, for example, in selection protocols, as follows.

Library construction 

A complex library was constructed from three fragments, each of
which contained one randomized area corresponding to a CDR-like loop. The
fragments were named BC, DE, and FG, based on the names of the
CDR-H-like loops contained within them; in addition to ¹⁰Fn3 and a
randomized sequence, each of the fragments contained stretches encoding an
N-terminal His6 domain or a C-terminal FLAG peptide tag. At each junction
between two fragments (i.e., between the BC and DE fragments or between the
DE and FG fragments), each DNA fragment contained recognition sequences
for the Ear I Type IIS restriction endonuclease. This restriction enzyme allowed
the splicing together of adjacent fragments while removing all foreign,
non-¹⁰Fn3, sequences. It also allows for a recombination-like mixing of the
three ¹⁰Fn3 fragments between cycles of mutagenesis and selection.

Each fragment was assembled from two overlapping
oligonucleotides, which were first annealed, then extended to form the
double-stranded DNA form of the fragment. The oligonucleotides that were
used to construct and process the three fragments are listed below; the "Top"
and "Bottom" species for each fragment are the oligonucleotides that contained
the entire ¹⁰Fn3 encoding sequence. In these oligonucleotide designations,
"N" indicates A, T, C, or G; and "S" indicates C or G.

HfnLbcTop (His):

5'- GG AAT TCC TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA
TTT ACA ATT ACA ATG CAT CAC CAT CAC CAT CAC GTT TCT GAT
GTT CCG AGG GAC CTG GAA GTT GTT GCT GCG ACC CCC ACC
AGC-3' (SEQ ID NO: 1)

HfnLbcTop (an alternative N-terminus):

5'- GG AAT TCC TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA
TTT ACA ATT ACA ATG GTT TCT GAT GTT CCG AGG GAC CTG GAA
GTT GTT GCT GCG ACC CCC ACC AGC-3' (SEQ ID NO: 2)

HFnLBCBot-flag8:

5'-AGC GGA TGC CTT GTC GTC GTC GTC CTT GTA GTC GCT CTT
CCC TGT TTC TCC GTA AGT GAT CCT GTA ATA TCT (SNN)7 CCA
GCT GAT CAG TAG GCT GGT GGG GGT CGC AGC-3' (SEQ ID NO: 3)

HFnBC3'-flag8:

5'-AGC GGA TGC CTT GTC GTC GTC GTC CTT GTA GTC GCT CTT
CCC TGT TTC TCC GTA AGT GAT CC-3' (SEQ ID NO: 4)

HFnLDETop:

5'- GG AAT TCC TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA
TTT ACA ATT ACA ATG CAT CAC CAT CAC CAT CAC CTC TTC ACA
GGA GGA AAT AGC CCT GTC C-3' (SEQ ID NO: 5)

HFnLDEBot-flag8:

5'-AGC GGA TGC CTT GTC GTC GTC GTC CTT GTA GTC GCT CTT
CGT ATA ATC AAC TCC AGG TTT AAG GCC GCT GAT GGT AGC TGT
(SNN)4 AGG CAC AGT GAA CTC CTG GAC AGG GCT ATT TCC TCC
TGT-3' (SEQ ID NO: 6)

HFnDE3'-flag8:

5'-AGC GGA TGC CTT GTC GTC GTC GTC CTT GTA GTC GCT CTT
CGT ATA ATC AAC TCC AGG TTT AAG G-3' (SEQ ID NO: 7)

HFnLFGTop:

5'- GG AAT TCC TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA
TTT ACA ATT ACA ATG CAT CAC CAT CAC CAT CAC CTC TTC TAT
ACC ATC ACT GTG TAT GCT GTC-3' (SEQ ID NO: 8)

HFnLFGBot-flag8:

5'-AGC GGA TGC CTT GTC GTC GTC GTC CTT GTA GTC TGT TCG
GTA ATT AAT GGA AAT TGG (SNN)10 AGT GAC AGC ATA CAC AGT
GAT GGT ATA-3' (SEQ ID NO: 9)

HFnFG3'-flag8:

5'-AGC GGA TGC CTT GTC GTC GTC GTC CTT GTA GTC TGT TCG
GTA ATT AAT GGA AAT TGG-3' (SEQ ID NO: 10)

T7Tmv (introduces T7 promoter and TMV untranslated region needed for
in vitro translation):

5'-GCG TAA TAC GAC TCA CTA TAG GGA CAA TTA CTA TTT ACA
ATT ACA-3' (SEQ ID NO: 11)

ASAflag8:

5'-AGC GGA TGC CTT GTC GTC GTC GTC CTT GTA GTC-3' (SEQ ID
NO: 12)

Unisplint-S (splint oligonucleotide used to ligate mRNA to the
puromycin-containing linker, described by Roberts et al., 1997, supra):

5'-TTTTTTTTTNAGCGGATGC-3' (SEQ ID NO: 13)

A18---2PEG (DNA-puromycin linker):

5'-(A)18(PEG)2CCPur (SEQ ID NO: 14)
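
The anneal-and-extend assembly of each fragment depends on the 3' ends of the "Top" and "Bottom" oligonucleotides being complementary. As a minimal illustrative sketch (the helper names are hypothetical and not part of the original protocol), the overlap can be checked by comparing the 3' end of a Top oligonucleotide with the reverse complement of its Bottom partner. The two strings below are the 3' ends of SEQ ID NO: 2 and SEQ ID NO: 3 as listed above.

```python
# Illustrative check of the Top/Bottom overlap used for anneal-and-extend
# assembly. Helper names are hypothetical, not from the original protocol.

# S (C or G) and N (any base) are their own complements as degenerate sets.
COMPLEMENT = str.maketrans("ACGTSN", "TGCASN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string (A, C, G, T, S, N)."""
    return seq.translate(COMPLEMENT)[::-1]


def overlap_length(top: str, bottom: str) -> int:
    """Longest suffix of `top` equal to a prefix of reverse_complement(bottom)."""
    rc = reverse_complement(bottom)
    for k in range(min(len(top), len(rc)), 0, -1):
        if top.endswith(rc[:k]):
            return k
    return 0


# 3' ends only (the last 33 nucleotides of each oligonucleotide).
bc_top_3prime = "GACCTGGAAGTTGTTGCTGCGACCCCCACCAGC"   # from SEQ ID NO: 2
bc_bot_3prime = "CCAGCTGATCAGTAGGCTGGTGGGGGTCGCAGC"   # from SEQ ID NO: 3

print("BC overlap (nt):", overlap_length(bc_top_3prime, bc_bot_3prime))
```

The same check applies to the DE and FG pairs; note also that the (SNN) triplets on the Bottom strand read as NNS codons on the coding strand.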

The pairs of oligonucleotides (500 pmol of each) were annealed in
100 µL of 10 mM Tris 7.5, 50 mM NaCl for 10 minutes at 85°C, followed by a
slow (0.5-1 hour) cooling to room temperature. The annealed fragments with
single-stranded overhangs were then extended using 100 U Klenow (New
England Biolabs, Beverly, MA) for each 100 µL aliquot of annealed oligos, and
the buffer made of 838.5 µl H2O, 9 µl 1 M Tris 7.5, 5 µl 1 M MgCl2, 20 µl 10
mM dNTPs, and 7.5 µl 1 M DTT. The extension reactions proceeded for 1 hour
at 25°C.

Next, each of the double-stranded fragments was transformed into an
RNA-protein fusion (PROfusion™) using the technique developed by Szostak
et al., U.S.S.N. 09/007,005 and U.S.S.N. 09/247,190; Szostak et al.,
WO98/31700; and Roberts & Szostak, Proc. Natl. Acad. Sci. USA (1997) vol.
94, p. 12297-12302. Briefly, the fragments were transcribed using an Ambion
in vitro transcription kit, MEGAshortscript (Ambion, Austin, TX), and the
resulting mRNA was gel-purified and ligated to a DNA-puromycin linker using
DNA ligase. The mRNA-DNA-puromycin molecule was then translated using
the Ambion rabbit reticulocyte lysate-based translation kit. The resulting
mRNA-DNA-puromycin-protein PROfusion™ was purified using Oligo(dT)
cellulose, and a complementary DNA strand was synthesized using reverse
transcriptase and the RT primers described above (Unisplint-S or flagASA),
following the manufacturer's instructions.

The PROfusion™ obtained for each fragment was next purified on
the resin appropriate to its peptide purification tag, i.e., on Ni-NTA agarose for
the His6-tag and M2 agarose for the FLAG-tag, following the procedure
recommended by the manufacturer. The DNA component of the tag-binding
PROfusions™ was amplified by PCR using Pharmacia Ready-To-Go PCR
Beads (Pharmacia, Piscataway, NJ), 10 pmol of 5' and 3' PCR primers, and the
following PCR program: Step 1: 95°C for 3 minutes; Step 2: 95°C for 30
seconds, 58/62°C for 30 seconds, 72°C for 1 minute, 20/25/30 cycles, as
required; Step 3: 72°C for 5 minutes; Step 4: 4°C until end.
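
For planning purposes, the PCR program above can be written down as data and its nominal hold time estimated. The following is only an illustrative sketch: the class and function names are hypothetical, ramp times between temperatures are ignored, and the final 4°C hold is excluded.

```python
# Illustrative encoding of the PCR program described above.
# Names are hypothetical; ramp times and the 4 C hold are not modeled.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    seconds: int    # hold time in seconds


def program_seconds(cycles: int, anneal_c: float) -> int:
    """Sum of hold times for Steps 1-3 of the program."""
    initial = Step(95, 180)                      # Step 1: 95 C, 3 min
    cycle = [Step(95, 30),                       # Step 2: denature
             Step(anneal_c, 30),                 #         anneal (58 or 62 C)
             Step(72, 60)]                       #         extend
    final = Step(72, 300)                        # Step 3: 72 C, 5 min
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + cycles * per_cycle + final.seconds


for n_cycles in (20, 25, 30):
    minutes = program_seconds(n_cycles, anneal_c=58) / 60
    print(f"{n_cycles} cycles: about {minutes:.0f} min of hold time")
```

The annealing temperature does not change the total hold time; it is carried in the data only so that the 58°C and 62°C variants can be distinguished.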

The resulting DNA was cleaved by 5 U Ear I (New England Biolabs)
per 1 µg DNA; the reaction took place in T4 DNA Ligase Buffer (New England
Biolabs) at 37°C, for 1 hour, and was followed by an incubation at 70°C for 15
minutes to inactivate Ear I. Equal amounts of the BC, DE, and FG fragments
were combined and ligated to form a full-length ¹⁰Fn3 gene with randomized
loops. The ligation required 10 U of fresh Ear I (New England Biolabs) and 20
U of T4 DNA Ligase (Promega, Madison, WI), and took 1 hour at 37°C.

Three different libraries were made in the manner described above.
Each contained the form of the FG loop with 10 randomized residues. The BC
and the DE loops of the first library bore the wild type ¹⁰Fn3 sequence; a BC
loop with 7 randomized residues and a wild type DE loop made up the second
library; and a BC loop with 7 randomized residues and a DE loop with 4
randomized residues made up the third library. The complexity of the FG loop
in each of these three libraries was 10¹³; the further two randomized loops
provided the potential for a complexity too large to be sampled in a laboratory.
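
As a rough, illustrative calculation (not taken from the original text), the theoretical diversity of loops randomized with NNS codons, the coding-strand equivalent of the SNN triplets in the Bottom oligonucleotides, can be counted as follows. The codon table is the standard genetic code; the variable and function names are hypothetical.

```python
# Illustrative count of NNS codon usage and theoretical loop diversity.
# Uses the standard genetic code; names are hypothetical.
from itertools import product

BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# NNS: any base at positions 1 and 2, C or G at position 3.
nns_codons = [a + b + c for a in "ACGT" for b in "ACGT" for c in "CG"]
amino_acids = {CODON_TABLE[c] for c in nns_codons} - {"*"}
stop_codons = [c for c in nns_codons if CODON_TABLE[c] == "*"]


def protein_diversity(randomized_residues: int) -> int:
    """Number of distinct peptide sequences over 20 amino acids."""
    return 20 ** randomized_residues


print("NNS codons:", len(nns_codons))
print("amino acids encoded:", len(amino_acids))
print("stop codons in NNS:", stop_codons)
# Randomized residues per library: FG only; BC + FG; BC + DE + FG.
for residues in (10, 7 + 10, 7 + 4 + 10):
    print(f"{residues} randomized residues: {protein_diversity(residues):.2e}")
```

Ten randomized positions give 20^10, about 1.0 x 10^13 peptide sequences, while 17 and 21 positions give about 1.3 x 10^22 and 2.1 x 10^27, far more than any physical library can contain.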

The three libraries constructed were combined into one master
library in order to simplify the selection process; target binding itself was
expected to select the most suitable library for a particular challenge.
PROfusions™ were obtained from the master library following the general
procedure described in Szostak et al., U.S.S.N. 09/007,005 and 09/247,190;
Szostak et al., WO98/31700; and Roberts & Szostak, Proc. Natl. Acad. Sci.
USA (1997) vol. 94, p. 12297-12302 (Figure 8).

Fusion Selections

The master library in the PROfusion™ form was subjected to
selection for binding to TNF-α. Two protocols were employed: one in which
the target was immobilized on an agarose column and one in which the target
was immobilized on a BIACORE chip. First, an extensive optimization of
conditions to minimize background binders to the agarose column yielded the
favorable buffer conditions of 50 mM HEPES pH 7.4, 0.02% Triton, 100 µg/ml
Sheared Salmon Sperm DNA. In this buffer, the non-specific binding of the
¹⁰Fn3 RNA fusion to TNF-α Sepharose was 0.3%. The non-specific binding
background of the ¹⁰Fn3 RNA-DNA to TNF-α Sepharose was found to be
0.1%.
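
To illustrate why a low non-specific background matters, the following toy model (an assumption-laden sketch, not part of the original experiments) tracks the fraction of true binders in a pool across selection rounds. The recovery efficiency of true binders and the starting binder fraction are hypothetical values chosen only for illustration; the default background is the 0.3% figure quoted above.

```python
# Toy enrichment model; all parameter values other than the 0.3% background
# are hypothetical and chosen only for illustration.


def next_fraction(fraction: float, recovery: float, background: float) -> float:
    """Binder fraction after one round of selection.

    fraction   -- fraction of true binders before the round
    recovery   -- fraction of true binders retained on the target (assumed)
    background -- fraction of non-binders retained non-specifically
    """
    kept_binders = fraction * recovery
    kept_background = (1.0 - fraction) * background
    return kept_binders / (kept_binders + kept_background)


def rounds_to_reach(start: float, target: float, recovery: float = 1.0,
                    background: float = 0.003, max_rounds: int = 50):
    """Return (rounds, final fraction) once the pool reaches `target`."""
    fraction, rounds = start, 0
    while fraction < target and rounds < max_rounds:
        fraction = next_fraction(fraction, recovery, background)
        rounds += 1
    return rounds, fraction


rounds, fraction = rounds_to_reach(start=1e-9, target=0.5)
print(f"rounds needed: {rounds}, final binder fraction: {fraction:.3f}")
```

In this idealized model each round multiplies the binder-to-background odds by recovery/background; real selections typically enrich more slowly because recovery is incomplete and binding is not all-or-nothing.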

During each round of selection on TNF-α Sepharose, the PROfusion™
library was first preincubated for an hour with underivatized Sepharose to
remove any remaining non-specific binders; the flow-through from this
pre-clearing was incubated for another hour with TNF-α Sepharose. The TNF-α
Sepharose was washed for 3-30 minutes.

After each selection, the PROfusion™ DNA that had been eluted
from the solid support with 0.3 M NaOH or 0.1 M KOH was amplified by PCR;
a DNA band of the expected size persisted through multiple rounds of selection
(Figure 9). Similar results were observed in the two alternative selection
protocols, and only the data from the agarose column selection are shown in
Figure 9.

In the first seven rounds, the binding of library PROfusions™ to the
target remained low; in contrast, when free protein was translated from DNA
pools at different stages of the selection, the proportion of the column binding
species increased significantly between rounds (Figure 10). Similar selections
may be carried out with any other binding species target (for example, IL-1 and
IL-13).

Animal Studies 

Wild-type ¹⁰Fn3 contains an integrin-binding tripeptide motif,
Arginine 78 - Glycine 79 - Aspartate 80 (the "RGD motif") at the tip of the FG
loop. In order to avoid integrin binding and a potential inflammatory response
based on this tripeptide in vivo, a mutant form of ¹⁰Fn3 was generated that
contained an inert sequence, Serine 78 - Glycine 79 - Glutamate 80 (the "SGE
mutant"), a sequence which is found in the closely related, wild-type ¹¹Fn3
domain. This SGE mutant was expressed as an N-terminally His6-tagged, free
protein in E. coli, and purified to homogeneity on a metal chelate column
followed by a size exclusion column.

In particular, the DNA sequence encoding His6-¹⁰Fn3(SGE) was
cloned into the pET9a expression vector and transformed into BL21 DE3
pLysS cells. The culture was then grown in LB broth containing 50 µg/mL
kanamycin at 37°C, with shaking, to A560=1.0, and was then induced with 0.4
mM IPTG. The induced culture was further incubated, under the same
conditions, overnight (14-18 hours); the bacteria were recovered by standard,
low speed centrifugation. The cell pellet was resuspended in 1/50 of the
original culture volume of lysis buffer (50 mM Tris 8.0, 0.5 M NaCl, 5%
glycerol, 0.05% Triton X-100, and 1 mM PMSF), and the cells were lysed by
passing the resulting paste through a Microfluidics Corporation Microfluidizer
M110-EH, three times. The lysate was clarified by centrifugation, and the
supernatant was filtered through a 0.45 µm filter followed by filtration through
a 0.2 µm filter. 100 mL of the clarified lysate was loaded onto a 5 mL Talon
cobalt column (Clontech, Palo Alto, CA), washed with 70 mL of lysis buffer, and
eluted with a linear gradient of 0-30 mM imidazole in lysis buffer. The flow
rate through the column through all the steps was 1 mL/min. The eluted protein
was concentrated 10-fold by dialysis (MW cutoff = 3,500) against
15,000-20,000 PEG. The resulting sample was dialyzed into buffer 1 (lysis
buffer without the glycerol), then loaded, 5 mL at a time, onto a 16 x 60 mm
Sephacryl 100 size exclusion column equilibrated in buffer 1. The column was
run at 0.8 mL/min, in buffer 1; all fractions that contained a protein of the
expected MW were pooled, concentrated 10X as described above, then
dialyzed into PBS. Toxikon (MA) was engaged to perform endotoxin screens
and animal studies on the resulting sample.

In these animal studies, the endotoxin levels in the samples examined
to date have been below the detection level of the assay. In a preliminary
toxicology study, this protein was injected into two mice at the estimated 100X
therapeutic dose of 2.6 mg/mouse. The animals survived the two weeks of the
study with no apparent ill effects. These results suggest that ¹⁰Fn3 may be
incorporated safely into an IV drug.

Alternative Constructs for In Vivo Use

To extend the half life of the 8 kD ¹⁰Fn3 domain, a larger molecule
has also been constructed that mimics natural antibodies. This ¹⁰Fn3-Fc
molecule contains the -CH1-CH2-CH3 (Figure 11) or -CH2-CH3 domains of the
IgG constant region of the host; in these constructs, the ¹⁰Fn3 domain is grafted
onto the N-terminus in place of the IgG domain (Figures 11 and 12). Such
antibody-like constructs are expected to improve the pharmacokinetics of the
protein as well as its ability to harness the natural immune response.

In order to construct the murine form of the ¹⁰Fn3-CH1-CH2-CH3
clone, the -CH1-CH2-CH3 region was first amplified from a mouse liver spleen
cDNA library (Clontech), then ligated into the pET25b vector. The primers
used in the cloning were 5' Fc Nest and 3' Fc Nest, and the primers used to
graft the appropriate restriction sites onto the ends of the recovered insert were
5' Fc HIII and 3' Fc Nhe:

5' Fc Nest 5' GCG GCA GGG TTT GCT TAC TGG GGC CAA GGG 3' (SEQ
ID NO: 15);

3' Fc Nest 5' GGG AGG GGT GGA GGT AGG TCA CAG TCC 3' (SEQ ID
NO: 16);

3' Fc Nhe 5' TTT GCT AGC TTT ACC AGG AGA GTG GGA GGC 3' (SEQ
ID NO: 17); and

5' Fc HIII 5' AAA AAG CTT GCC AAA ACG ACA CCC CCA TCT GTC 3'
(SEQ ID NO: 18).

Further PCR is used to remove the CH1 region from this clone and
create the Fc part of the shorter, ¹⁰Fn3-CH2-CH3 clone. The sequence encoding
¹⁰Fn3 is spliced onto the 5' end of each clone; either the wild type ¹⁰Fn3 cloned
from the same mouse spleen cDNA library or a modified ¹⁰Fn3 obtained by
mutagenesis or randomization of the molecules can be used. The
oligonucleotides used in the cloning of murine wild-type ¹⁰Fn3 were:

Mo5PCR-NdeI:

5' CATATGGTTTCTGATATTCCGAGAGATCTGGAG 3' (SEQ ID NO: 19);

Mo5PCR-His-NdeI (for an alternative N-terminus with the His6
purification tag):

5' CAT ATG CAT CAC CAT CAC CAT CAC GTT TCT GAT ATT
CCG AGA G 3' (SEQ ID NO: 20); and

Mo3PCR-EcoRI:

5' GAATTCCTATGTTTTATAATTGATGGAAAC 3' (SEQ ID NO: 21).

The human equivalents of the clones are constructed using the same
strategy with human oligonucleotide sequences.

Other embodiments are within the claims.

All publications, patents, and patent applications mentioned herein
are hereby incorporated by reference.

What is claimed is:

Claims

1. A protein comprising a fibronectin type III domain having at least
one randomized loop, said protein being characterized by its ability to bind to a
compound that is not bound by the corresponding naturally-occurring
fibronectin.

2. The protein of claim 1, wherein said fibronectin type III domain is
a mammalian fibronectin type III domain.

3. The protein of claim 2, wherein said fibronectin type III domain is
a human fibronectin type III domain.

4. The protein of claim 1, wherein said protein comprises the tenth
module of said fibronectin type III domain (¹⁰Fn3).

5. The protein of claim 4, wherein said compound binding is
mediated by one ¹⁰Fn3 loop.

6. The protein of claim 4, wherein said compound binding is
mediated by two ¹⁰Fn3 loops.

7. The protein of claim 4, wherein said compound binding is
mediated by three ¹⁰Fn3 loops.

8. The protein of claim 4, wherein the second loop of said ¹⁰Fn3 is
extended in length relative to the naturally-occurring module.

9. The protein of claim 4, wherein said ¹⁰Fn3 lacks an
integrin-binding motif.

10. The protein of claim 9, wherein said integrin-binding motif is
replaced by an amino acid sequence comprising a basic amino acid-neutral
amino acid-acidic amino acid motif.

11. The protein of claim 10, wherein said integrin-binding motif is
replaced by an amino acid sequence comprising serine-glycine-glutamate.

12. The protein of claim 1, wherein said protein lacks disulfide
bonds.

13. The protein of claim 1, wherein said protein is part of a fusion
protein.

14. The protein of claim 13, wherein said fusion protein further
comprises an immunoglobulin domain.

15. The protein of claim 13, wherein said fusion protein further
comprises a complement protein.

16. The protein of claim 13, wherein said fusion protein further
comprises a toxin protein.

17. The protein of claim 13, wherein said fusion protein further
comprises an albumin protein.

18. The protein of claim 1, wherein said protein is covalently bound
to a nucleic acid.

19. The protein of claim 18, wherein said nucleic acid encodes said
protein.

20. The protein of claim 18, wherein said nucleic acid is RNA.

21. The protein of claim 1, wherein said protein is a multimer.

22. The protein of claim 1 or 9, wherein said protein is formulated in
a physiologically-acceptable carrier.

23. A nucleic acid encoding the protein of claim 1 or 4.

24. The nucleic acid of claim 23, wherein said nucleic acid is DNA.

25. The nucleic acid of claim 23, wherein said nucleic acid is RNA.

26. A method for generating a protein comprising a fibronectin type
III domain which is pharmaceutically acceptable to a mammal, said method
comprising removing an integrin-binding domain from said fibronectin type III
domain.

27. The method of claim 26, wherein said integrin binding motif is
replaced by an amino acid sequence comprising a basic amino acid-neutral
amino acid-acidic amino acid motif.

28. The protein of claim 27, wherein said integrin-binding motif is
replaced by an amino acid sequence comprising serine-glycine-glutamate.

29. The method of claim 26, wherein said at least one loop of said
fibronectin type III domain is randomized.

30. The method of claim 26, wherein said protein comprises the
tenth module of said fibronectin type III domain.

31. The protein of claim 26, wherein said protein is part of a fusion
protein.

32. The protein of claim 31, wherein said fusion protein further
comprises an immunoglobulin Fc domain.

33. The protein of claim 31, wherein said fusion protein further
comprises a complement protein.

34. The protein of claim 31, wherein said fusion protein further
comprises a toxin protein.

35. The protein of claim 31, wherein said fusion protein further
comprises an albumin protein.

36. The method of claim 26, wherein said mammal is a human.

37. A method for obtaining a protein which binds to a compound,
said method comprising:
(a) contacting said compound with a candidate protein, said
candidate protein comprising a fibronectin type III domain having at least one
randomized loop, said contacting being carried out under conditions that allow
compound-protein complex formation; and
(b) obtaining, from said complex, said protein which binds to said
compound.

38. A method for obtaining a compound which binds to a protein,
said protein comprising a fibronectin type III domain having at least one
randomized loop, said method comprising:
(a) contacting said protein with a candidate compound, said
contacting being carried out under conditions that allow compound-protein
complex formation; and
(b) obtaining, from said complex, said compound which binds to said
protein.

39. The method of claim 37, said method further comprising
randomizing at least one loop of said fibronectin type III domain of said protein
obtained in step (b) and repeating said steps (a) and (b) using said further
randomized protein.

40. The method of claim 38, said method further comprising
modifying said compound obtained in step (b) and repeating said steps (a) and
(b) using said further modified compound.

41. The method of claim 37 or 38, wherein said compound is a
protein.

42. The method of claim 37 or 38, wherein said fibronectin type III
domain is a mammalian fibronectin type III domain.

43. The method of claim 42, wherein said fibronectin type III
domain is a human fibronectin type III domain.

44. The method of claim 37 or 38, wherein said protein comprises
the tenth module of said fibronectin type III domain (¹⁰Fn3).

45. The method of claim 44, wherein said compound binding is
mediated by one ¹⁰Fn3 loop.

46. The method of claim 44, wherein said compound binding is
mediated by two ¹⁰Fn3 loops.

47. The method of claim 44, wherein said compound binding is
mediated by three ¹⁰Fn3 loops.

48. The method of claim 44, wherein the second loop of said ¹⁰Fn3
is extended in length relative to the naturally-occurring module.

49. The method of claim 44, wherein said ¹⁰Fn3 lacks an
integrin-binding motif.

50. The method of claim 37, wherein said compound is immobilized
on a solid support.

51. The method of claim 38, wherein said protein is immobilized on
a solid support.

52. The method of claim 50 or 51, wherein said solid support is a
column or microchip.

SEQUENCE LISTING



<110> Phylos, Inc. 



<120> PROTEIN SCAFFOLDS FOR ANTIBODY MIMICS 
AND OTHER BINDING PROTEINS

<130> 50036/021WO2

<150> 60/111,737

<151> 1998-12-10

<160> 21

<170> FastSEQ for Windows Version 4.0

<210> 1
<211> 122
<212> DNA
<213> Homo sapiens

<400> 1
ggaattccta atacgactca ctatagggac aattactatt tacaattaca atgcatcacc 60
atcaccatca cgtttctgat gttccgaggg acctggaagt tgttgctgcg acccccacca 120
gc 122

<210> 2
<211> 104
<212> DNA
<213> Homo sapiens

<400> 2
ggaattccta atacgactca ctatagggac aattactatt tacaattaca atggtttctg 60
atgttccgag ggacctggaa gttgttgctg cgacccccac cagc 104

<210> 3
<211> 126
<212> DNA
<213> Homo sapiens

<220>
<221> misc_feature
<222> (1)...(126)
<223> n = A,T,C or G

<221> misc_feature
<222> (1)...(126)
<223> s = C or G

<400> 3
agcggatgcc ttgtcgtcgt cgtccttgta gtcgctcttc cctgtttctc cgtaagtgat 60
cctgtaatat ctsnnsnnsn nsnnsnnsnn snnccagctg atcagtaggc tggtgggggt 120
cgcagc 126

<210> 4
<211> 62
<212> DNA
<213> Homo sapiens

<400> 4
agcggatgcc ttgtcgtcgt cgtccttgta gtcgctcttc cctgtttctc cgtaagtgat 60
cc 62

<210> 5
<211> 99
<212> DNA
<213> Homo sapiens

<400> 5

ggaattccta atacgactca ctatagggac aattactatt tacaattaca atgcatcacc 60
atcaccatca cctcttcaca ggaggaaata gccctgtcc 99

<210> 6
<211> 132
<212> DNA
<213> Homo sapiens

<220>
<221> misc_feature
<222> (1)...(132)
<223> n = A,T,C or G

<221> misc_feature
<222> (1)...(132)
<223> s = C or G

<400> 6
agcggatgcc ttgtcgtcgt cgtccttgta gtcgctcttc gtataatcaa ctccaggttt 60
aaggccgctg atggtagctg tsnnsnnsnn snnaggcaca gtgaactcct ggacagggct 120
atttcctcct gt 132

<210> 7
<211> 64
<212> DNA
<213> Homo sapiens

<400> 7
agcggatgcc ttgtcgtcgt cgtccttgta gtcgctcttc gtataatcaa ctccaggttt 60
aagg 64

<210> 8
<211> 101
<212> DNA
<213> Homo sapiens

<400> 8
ggaattccta atacgactca ctatagggac aattactatt tacaattaca atgcatcacc 60
atcaccatca cctcttctat accatcactg tgcatgctgt c 101

<210> 9
<211> 114
<212> DNA
<213> Homo sapiens

<220>
<221> misc_feature
<222> (1)...(114)
<223> n = A,T,C or G

<221> misc_feature
<222> (1)...(114)
<223> s = C or G

<400> 9
agcggatgcc ttgtcgtcgt cgtccttgca gtctgttcgg taattaatgg aaattggsnn 60
snnsnnsnns nnsnnsnnsn nsnnsnnagt gacagcatac acagtgatgg tata 114

<210> 10
<211> 57
<212> DNA
<213> Homo sapiens

<400> 10
agcggatgcc ttgtcgtcgt cgtccttgca gtctgttcgg taattaatgg aaattgg 57

<210> 11
<211> 45
<212> DNA
<213> T7 phage and tobacco mosaic virus

<400> 11

gcgtaatacg actcactata gggacaatta ctatttacaa ttaca 45

<210> 12
<211> 33
<212> DNA
<213> Artificial Sequence

<220>
<223> Flag sequence

<400> 12
agcggatgcc ttgtcgtcgt cgtccttgta gtc 33

<210> 13
<211> 19
<212> DNA
<213> Artificial Sequence

<220>
<223> Splint oligonucleotide

<221> misc_feature
<222> (1)...(19)
<223> n = A,T,C or G

<400> 13
tttttttttn agcggatgc 19

<210> 14
<211> 20
<212> DNA
<213> Artificial Sequence

<220>
<223> Puromycin linker oligonucleotide

<400> 14
aaaaaaaaaa aaaaaaaacc 20

<210> 15
<211> 30
<212> DNA
<213> Mus musculus

<400> 15

gcggcagggt ttgcttactg gggccaaggg 30

<210> 16
<211> 27
<212> DNA
<213> Mus musculus

<400> 16
gggaggggtg gaggtaggtc acagtcc 27

<210> 17
<211> 30
<212> DNA
<213> Mus musculus

<400> 17
tttgctagct ttaccaggag agtgggaggc 30

<210> 18
<211> 33
<212> DNA
<213> Mus musculus

<400> 18
aaaaagcttg ccaaaacgac acccccatct gtc 33

<210> 19
<211> 33
<212> DNA
<213> Mus musculus

<400> 19
catatggttt ctgatattcc gagagatctg gag 33

<210> 20
<211> 43
<212> DNA
<213> Mus musculus

<400> 20
catatgcatc accatcacca tcacgtctct gatattccga gag 43

<210> 21
<211> 30
<212> DNA
<213> Mus musculus

<400> 21
gaattcctat gttttataat tgatggaaac 30